Annotations of Transcript (159)

Id Begin End Content
a1070 00:00:19.041 00:00:29.116 Time flies. It's actually almost twenty years ago when I wanted to reframe the way we use information, the way we work together, I (??) it the World Wide Web.
a1071 00:00:29.116 00:00:36.223 Now, twenty years on, TED, I want to ask your help in a new reframement.
a1072 00:00:36.223 00:00:44.701 So going back to 1989, I wrote a memo suggesting a global hypertext system.
a1073 00:00:44.701 00:00:47.822 Nobody really did anything with it very much.
a1074 00:00:47.822 00:00:59.507 But eighteen months later - this is, you know, this is how innovation happens, eighteen months later my boss said I could do it on-on a side, as a sort of a play(?) project, ehr, (??) on the computer we'd got.
a1075 00:00:59.507 00:01:02.413 And so he gave me the time to code it up.
a1076 00:01:02.413 00:01:15.334 So, I basically roughed out what HTML looks like, the hypertext protocol, HTTP, the idea of URLs, these names for things which sorted(?) HTTP.
a1077 00:01:15.334 00:01:18.001 I wrote the code, and put it out there.
a1078 00:01:18.001 00:01:21.221 Why did I do it? Well, it was basically frustration.
a1079 00:01:21.221 00:01:29.999 I was frustrated with- I was in this- I was working as a software engineer in this huge very exciting lab. Lots of people coming from all over the world.
a1080 00:01:29.999 00:01:38.111 They (??) all sorts of different communities(?) with them, they had all sort of different data formats, all sorts of kinds of documentation systems.
a1081 00:01:38.111 00:01:51.096 So that, in all that diversity, if I wanted to figure out how to build something out of one little bit of this and a bit of this, everything I looked into, I had to connect to some new machine, I had to learn to run some new program.
a1082 00:01:51.096 00:01:57.707 I had to- I would find the data, maybe, the information I wanted, in some new data format and they were all com- incompatible.
a1083 00:01:57.707 00:02:01.912 It was just very frustrating, the frustration was on this- all this unlocked potential.
a1084 00:02:01.912 00:02:04.689 In fact on all these disks, there were documents.
a1085 00:02:04.689 00:02:16.990 So, if you just imagine them all being part of some big virtual documentation system in the sky, then- say, on the internet, then life would be so much easier.
a1086 00:02:16.990 00:02:29.203 Well, once you have an idea like that, it kind of gets under your skin, and even if people don't read your memo (actually he did, it was found after he died, his copy, it was found and he'd written "vague but exciting" in pencil in the corner).
a1087 00:02:29.203 00:02:38.477 But in general, it was difficult to exp- it's really difficult to explain what the Web was like, you don't- it's difficult to explain to people now, that(?) it was difficult then.
a1088 00:02:38.477 00:02:41.770 But then, OK, when TED started, there was no Web.
a1089 00:02:41.770 00:02:44.669 So we- things like clicked in(?) have(?) the same meaning.
a1090 00:02:44.669 00:02:52.243 I could show somebody a piece of hypertext, a page which has got some links, and we click on a link and *bing*, there will be another hypertext page.
a1091 00:02:52.243 00:02:58.148 Not impressive, you know, we've seen that, we've got things on the hypertext on CD-ROMs.
a1092 00:02:58.148 00:03:06.800 What was difficult was to get them to imagine. So imagine that that link could have gone to virtually any document you could imagine.
a1093 00:03:06.800 00:03:13.156 All right? That is the- that is the leap that was very difficult for people to make. Well, some people did.
a1094 00:03:13.056 00:03:16.718 So yes, it was difficult to explain, but it was a grassroots movement.
a1095 00:03:16.718 00:03:28.887 And that is what made it has made it most of- most fun. That was the most exciting thing, not the technology, not the things people'd done with it, but actually the community, the spirit of all these people getting together, sending e-mails.
a1096 00:03:28.887 00:03:34.029 That's what it was like, then. Do you know what, it's funny but right now it's kind of like that again.
a1097 00:03:34.029 00:03:40.000 I asked everybody more or less to put their documents, say "Could you put your documents on this Web thing."
a1098 00:03:40.000 00:03:43.412 And you did, thanks.
a1099 00:03:43.512 00:03:56.844 It were- it's been a blast, hasn't it. I mean, it's- it's been quite interesting because we found out that the things that happened with the Web really blew(?) us away. They're much more than we'd eventually(?) imagined, when we put together the little web- you know, the initial website that we started off with.
a1100 00:03:56.844 00:04:00.018 Now, I want you to put your data on the Web.
a1101 00:04:00.018 00:04:10.825 Turns out that there is still huge unlocked potential. There is still a huge frustration that people have because we haven't got data on the Web as data.
a1102 00:04:10.825 00:04:13.339 What do you mean, "data", what's the difference, documents, data?
a1103 00:04:13.339 00:04:15.630 Well documents you read, OK?
a1104 00:04:15.630 00:04:18.469 More or less, you can read them, you can put a link from them and that's it.
a1105 00:04:18.469 00:04:20.976 Data, you can do all kinds of stuff with the computer.
a1106 00:04:20.976 00:04:26.755 Who was here or, don't know, has seen Hans Rosling's talk.
a1107 00:04:26.755 00:04:32.399 When Hans Rosling was at TED, yeah, one of the- great, yes, a lot of people have seen it, 'cause it was one of the greatest TED talks.
a1108 00:04:32.399 00:04:45.854 Hans put up this presentation in which he shows, for various different countries in various different colours, he shows income level on one axis and he showed infant mortality, and he showed this thing animated from time.
a1109 00:04:45.854 00:04:56.448 So he'd taken this data, made a presentation which just shattered a lot of myths that people have about the economics in the developing world.
a1110 00:04:56.448 00:04:59.064 He put up a slide a little bit like this.
a1111 00:04:59.064 00:05:00.841 It had underground all the data.
a1112 00:05:00.841 00:05:05.836 OK, data is brown and boxy and boring and all that(?), that's what we think of it, isn't it, data?
a1113 00:05:05.836 00:05:08.745 Cause data you can't naturally use by itself.
a1114 00:05:08.745 00:05:22.614 But in fact data drives a huge amount of what happens in our lives. It happens because somebody takes that data and does something with it. In this case Hans, he could put the data together, he found from all kinds of United Nations websites and things.
a1115 00:05:22.614 00:05:27.723 He put it together, combined it into something more interesting than the original pieces.
a1116 00:05:27.723 00:05:37.956 And then he put it into this software, which I think his son developed originally, and produces this wonderful presentation.
a1117 00:05:37.956 00:05:50.732 And Hans made a point of saying it's really important to have a lot of data, and I'm happy to see, at the party last night, that he was still saying very forcibly, it's really important to have a lot of data.
a1118 00:05:50.732 00:06:04.821 So I want us now to think about, not just two pieces of data being connected, or six like he did, but I want to think about a world where everybody has put data on the Web, and so virtually anything you could imagine is on the Web, and I'm calling that Linked Data.
a1119 00:06:04.821 00:06:07.777 The technology is Linked Data, and it's extremely simple.
a1120 00:06:07.777 00:06:10.544 If you want to put something on the Web, there are three rules.
a1121 00:06:10.544 00:06:32.503 First thing is, that those HTTP names, those things that start with "http:", we're using them not just for documents, now we're using them for things that the documents are about. We're using them for people, we're using them for places. We're using them for your products. We're using them for events. All kinds of conceptual things they star- they have names now, that start with "http".
a1122 00:06:32.503 00:06:55.584 Second rule: when- if I take one of these "http" names and I look it up, I go and do the Web thing with it, I fetch the data using the HTTP protocol from the Web, I will get back some data in a standard format which is kind of useful data somebody might like to know about that thing, about that event, who's at the event, whatever it is about that person, where they were born, things like that.
a1123 00:06:55.584 00:06:58.185 So, second rule is: I get important information back.
a1124 00:06:58.185 00:07:06.838 Third rule is that when I get back this information, it's not just got somebody's height and weight and when they were born, it's got relationships.
a1125 00:07:06.838 00:07:10.482 Data is relationships. Interestingly, data is relationships.
a1126 00:07:10.482 00:07:24.444 It's got this person was born in Berlin, Berlin is in Germany, and when it has relationships, whatever expresses this relationship, then the other thing that it's related to is given a na- one of those names that starts "http".
a1127 00:07:24.444 00:07:36.219 So I can go ahead and look that thing up. So I look up a Person, I can look up then the city where they were born, then I can look up the region it's in, and the town it's in and the population of it, and so on, so I can browse this stuff.
a1128 00:07:36.219 00:07:40.849 So that's it really. That is Linked Data.
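The three rules just described can be sketched in a few lines of Python. This is a toy illustration, not anything shown in the talk: the `WEB` dictionary stands in for the real Web (no network request is made), and every `http://example.org/...` name, as well as the helpers `look_up` and `follow`, is hypothetical.

```python
# Toy, in-memory stand-in for the Web: no network access is performed.
# Every http://example.org/... name below is hypothetical.
PROP = "http://example.org/prop/"

WEB = {
    # Rule 1: things (a person, a city, a country) are named with http names.
    "http://example.org/person/alice": {
        PROP + "name": "Alice",
        PROP + "bornIn": "http://example.org/place/berlin",
    },
    "http://example.org/place/berlin": {
        PROP + "name": "Berlin",
        PROP + "locatedIn": "http://example.org/place/germany",
    },
    "http://example.org/place/germany": {
        PROP + "name": "Germany",
    },
}


def look_up(uri):
    """Rule 2: looking up an http name returns data about that thing."""
    return WEB.get(uri, {})


def follow(uri, *properties):
    """Rule 3: relationships point at other http names, so they can be followed."""
    current = uri
    for prop in properties:
        current = look_up(current).get(prop)
        if current is None:
            return None  # the chain of relationships ends here
    return current


# Browse from a person to the country of the city where they were born.
country = follow("http://example.org/person/alice",
                 PROP + "bornIn", PROP + "locatedIn")
print(look_up(country)[PROP + "name"])
```

Each step of `follow` is one more look-up on a name returned by the previous one, which is the "browse this stuff" idea in miniature.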
a1129 00:07:40.849 00:07:48.655 I wrote an article entitled "Linked Data" a couple of years ago, and soon after that, things started to happen.
a1130 00:07:48.655 00:07:57.036 The idea of Linked Data is that we get lots and lots and lots of these boxes that Hans had, and we get lots and lots and lots of things sprouting.
a1131 00:07:57.036 00:08:01.917 It's not just a whole lot of other plants, it's not just a root supplying a plant.
a1132 00:08:01.917 00:08:16.654 But for each of those plants, whatever it is, a presentation, an analysis, somebody's looking for patterns in the data, they get to look at all the data and they get it connected together, and the really important thing about data is that the more things you have to connect together, the more powerful it is.
a1133 00:08:16.654 00:08:20.680 So, Linked Data, the meme went out there.
a1134 00:08:20.680 00:08:27.114 And pretty soon Chris Bizer at the Freie Universität in Berlin was one of the first people to put interesting things up.
a1135 00:08:27.114 00:08:40.545 He noticed that Wikipedia, you know Wikipedia, the online encyclopedia with lots and lots of interesting documents in it, well in those documents, there are little squares, little boxes and those- in those information boxes, there's data.
a1136 00:08:40.545 00:08:49.365 So he wrote a program to take the data, extract it from Wikipedia and put it into a blob of linked data on the Web, which he called DBpedia.
a1137 00:08:49.365 00:08:53.683 DBpedia is represented by the blue blob in the middle of this slide.
a1138 00:08:53.683 00:09:00.386 And if you actually go and look at Berlin you'll find that there are other blobs of data which also have stuff about Berlin and they are linked together.
a1139 00:09:00.386 00:09:08.397 So if you pull the data from DBpedia about Berlin, you'll end up pulling up these other things as well. And the exciting thing is: it's starting to grow.
a1140 00:09:08.397 00:09:10.722 This is just a grassroots stuff again, OK?
a1141 00:09:10.722 00:09:13.496 Now let's think about data (??).
a1142 00:09:13.496 00:09:16.814 Data comes in fact in lots and lots of different forms.
a1143 00:09:16.814 00:09:22.265 Think of the diversity of the Web. It's a really important thing that the Web allows you to put all kinds of data up there.
a1144 00:09:22.265 00:09:25.038 So it is with data. I can talk about all kinds of data.
a1145 00:09:25.038 00:09:29.976 We can talk about government data, enterprise data is really important.
a1146 00:09:29.976 00:09:32.681 There's scientific data, there's personal data.
a1147 00:09:32.681 00:09:34.880 There's weather data, there's data about events.
a1148 00:09:34.880 00:09:38.811 There's data about talks, and there's news, and there's all kinds of stuff.
a1149 00:09:38.811 00:09:47.205 I'm just going to mention a few of them, so that you get the idea of the diversity of it, so that you also see how much unlocked potential.
a1150 00:09:47.205 00:09:48.292 Let's start with government data.
a1151 00:09:48.292 00:09:58.065 Barack Obama said in a speech that he- the American government data would be available on the internet in accessible formats.
a1152 00:09:58.065 00:10:00.904 And I hope that they will put it out as linked data.
a1153 00:10:00.904 00:10:02.362 That's important.
a1154 00:10:02.362 00:10:05.963 Why is it important? Not just for transparency. Yes, transparency in government's important.
a1155 00:10:05.963 00:10:16.416 But that data, this is the data from all the government departments. Think about how much of that data is about how life is lived in America. It's actually useful, it's got value. I can use it in my company.
a1156 00:10:16.416 00:10:18.328 I could use it as a kid to do my homework.
a1157 00:10:18.328 00:10:25.059 So we're talking about making the place, making the world run better by making this data available.
a1158 00:10:25.059 00:10:36.117 In fact if you're responsible, if you know about some data in a government department, often you find that these people, they're very tempted to keep it, to (??) in database hugging.
a1159 00:10:36.117 00:10:40.856 You hug your database, you don't want to let it go until you've made a beautiful website for it.
a1160 00:10:40.856 00:10:46.963 Well I'd like to suggest that rath- before you- yes, make a beautiful website (who am I to say "don't make a beautiful website").
a1161 00:10:46.963 00:10:53.293 Make a beautiful website, but first, give us the unadulterated data. We want the data.
a1162 00:10:53.293 00:10:56.050 We want unadulterated data. OK.
a1163 00:10:56.050 00:11:01.280 We have to ask for raw data now, and I'm gonna ask you to practice that, OK?
a1164 00:11:01.280 00:11:03.635 Can you say "raw"?
a1165 00:11:03.635 00:11:05.136 Can you say "data"?
a1166 00:11:05.136 00:11:06.544 Can you say "now"?
a1167 00:11:06.544 00:11:10.979 Right: "raw data now".
a1168 00:11:10.979 00:11:21.149 Practice that, it's important, because you have no idea the number of excuses people come up with to hang on to their data, and not give it to you, even though you've paid for it as a taxpayer.
a1169 00:11:21.149 00:11:23.314 And it's not just America, it's all over the world.
a1170 00:11:23.314 00:11:26.421 That is not just governments, of course it's enterprises as well.
a1171 00:11:26.421 00:11:29.384 So I'm just going to mention a few other sources of data.
a1172 00:11:29.384 00:11:39.165 Well here we are, TED, and all the time we are very conscious of the huge challenges that human society has right now.
a1173 00:11:39.165 00:11:40.130 Curing cancer.
a1174 00:11:40.130 00:11:42.561 Understanding the brain for Alzheimer's.
a1175 00:11:42.561 00:11:45.140 Understanding economics, making it a little more stable.
a1176 00:11:45.140 00:11:46.726 Understanding how the world works.
a1177 00:11:46.726 00:11:51.534 The people who are gonna solve those are scientists, they have half-formed ideas in their head.
a1178 00:11:51.534 00:12:03.355 They try to communicate those over the Web, but a lot of the state of knowledge of the human race at the moment is on databases, often sitting in their computers and actually commonly not shared.
a1179 00:12:03.355 00:12:05.925 In fact, I'm just going to one area:
a1180 00:12:05.925 00:12:16.331 if you're looking at Alzheimer's for example, drug discovery, there is a whole lot of linked data which is just coming out because scientists in that field realize this is a great way of getting out of those silos.
a1181 00:12:16.331 00:12:21.503 Because they had that genomic data in one database and in one building.
a1182 00:12:21.503 00:12:30.856 And they had that protein data in another. Now they are sticking it onto it: Linked data. And now they can ask a question, a question that you probably wouldn't ask, I wouldn't ask, they would:
a1183 00:12:30.856 00:12:35.779 "What proteins are involved in signal transduction and also are related to pyramidal neurons?"
a1184 00:12:35.779 00:12:43.398 Well you take that (??) and if you put it to Google, of course there is no page on the web which would answer that question because nobody has asked that question before.
a1185 00:12:43.398 00:12:47.309 You get 223,000 hits: no result you can use.
a1186 00:12:47.309 00:12:55.336 You ask the Linked Data which they've now put together: 32 hits, each of which is a protein which has these properties, and you can look at.
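A question that bridges two databases, like the one above, can be sketched as a join over relationships. This is a minimal illustration under stated assumptions, not the scientists' actual system: the two lists of (subject, relationship, object) triples, the `http://example.org/protein/...` names, and the helper functions are all hypothetical, and a real deployment would more likely use a dedicated query language over the linked data.

```python
# Hypothetical (subject, relationship, object) triples from two separate databases.
GENOMICS_DB = [
    ("http://example.org/protein/P1", "involvedIn", "signal transduction"),
    ("http://example.org/protein/P2", "involvedIn", "signal transduction"),
    ("http://example.org/protein/P3", "involvedIn", "protein folding"),
]
NEURO_DB = [
    ("http://example.org/protein/P2", "relatedTo", "pyramidal neurons"),
    ("http://example.org/protein/P3", "relatedTo", "pyramidal neurons"),
]


def subjects(triples, relationship, obj):
    """Return every subject that has the given relationship to the given object."""
    return {s for (s, r, o) in triples if r == relationship and o == obj}


def proteins_matching_both(triples):
    """Proteins involved in signal transduction AND related to pyramidal neurons."""
    in_signalling = subjects(triples, "involvedIn", "signal transduction")
    near_neurons = subjects(triples, "relatedTo", "pyramidal neurons")
    return sorted(in_signalling & near_neurons)


# Linking the data: both databases use the same http name for the same protein,
# so their triples can simply be pooled and queried together.
linked = GENOMICS_DB + NEURO_DB
print(proteins_matching_both(linked))
```

The join only works because both sources name a protein with the same "http" name; neither database alone can answer the question.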
a1187 00:12:55.336 00:13:04.299 The power of being able to ask those questions of a scientist, those questions which actually bridge across different disciplines is really a complete (??) change.
a1188 00:13:04.299 00:13:07.647 It's very very important. Scientists have totally (??) at the moment there(?).
a1189 00:13:07.647 00:13:17.028 The power of the data that other scientists have collected is locked up and we need to get it unlocked so we tackle those huge problems.
a1190 00:13:17.028 00:13:23.553 Now, if I go on like this you'll think that all the data comes from huge institutions, and it has nothing to do with you.
a1191 00:13:23.553 00:13:25.018 But that's not true.
a1192 00:13:25.018 00:13:27.927 In fact data is about our lives.
a1193 00:13:27.927 00:13:34.573 You just- you log on to your social networking site, you pick your favourite one, you say "this is my friend", *bing*, relationship, data.
a1194 00:13:34.573 00:13:39.772 You say "this photograph, oh, it's about- it depicts this person", *bing*, that's data.
a1195 00:13:39.772 00:13:46.686 Data data data. Every time you do things in a social networking site, the social networking site is taking data and using it, repurposing it.
a1196 00:13:46.686 00:13:51.824 And using it to make other people's lives more interesting on the site.
a1197 00:13:51.824 00:14:00.624 But when you go to another Linked Data site, and you say this one about travel, and you say "I want to send this photo to all the people in that group", you can't get over the walls.
a1198 00:14:00.624 00:14:04.782 The Economist wrote an article about it, lots of people blogged about it, tremendous frustration.
a1199 00:14:04.782 00:14:10.600 The way to break down the silos to get interoperability between social networking sites, we need to do that with Linked Data.
a1200 00:14:10.600 00:14:16.509 One last type of data I will talk about, maybe it's the most exciting, before I came down here I looked up on the OpenStreetMap.
a1201 00:14:16.509 00:14:18.430 OpenStreetMap is a map, but it's also a wiki.
a1202 00:14:18.430 00:14:22.472 Zoom in and that square thing is the theatre which we're in right now, the Terrace Theatre.
a1203 00:14:22.472 00:14:23.659 It didn't have a name on it.
a1204 00:14:23.659 00:14:26.230 So I could go in Edit mode, I could select the theatre.
a1205 00:14:26.230 00:14:28.566 I could add on down the bottom the name.
a1206 00:14:28.566 00:14:36.645 And then I could save it back, and now if you go back to the openstreetmap.org, and you find this place, you will find that the Terrace Theatre's got a name.
a1207 00:14:36.645 00:14:37.897 I did that, me.
a1208 00:14:37.897 00:14:39.339 I did that on the map.
a1209 00:14:39.339 00:14:41.548 I just did that, I put that up on there and you know what?
a1210 00:14:41.548 00:14:51.149 If I- the StreetMap is all about everybody doing their bit, and this creates an incredible resource because everybody else does theirs.
a1211 00:14:51.149 00:14:54.005 And that is what Linked Data is all about.
a1212 00:14:54.005 00:15:00.983 It's about people doing their bit to produce a little bit, and it all connecting.
a1213 00:15:00.983 00:15:03.634 That's how Linked Data works.
a1214 00:15:03.634 00:15:07.636 But you do your bit, everybody else does theirs.
a1215 00:15:07.636 00:15:16.681 You may not have lots of data which you have to- yourself to put on there, but you know to demand it, and we've practiced that.
a1216 00:15:16.681 00:15:20.461 So, Linked Data is this huge.
a1217 00:15:20.461 00:15:22.920 I've only told you of a very small number of things.
a1218 00:15:22.920 00:15:28.702 There are data in every aspect of our lives, every aspect of work and pleasure, OK?
a1219 00:15:28.702 00:15:31.972 And it's not just about the number of places where data comes.
a1220 00:15:31.972 00:15:39.836 It's about connecting it together, and when you connect data together, you get power in a way that doesn't happen just with the Web, with documents.
a1221 00:15:39.836 00:15:44.271 You get this really huge power out of it.
a1222 00:15:44.271 00:15:48.803 So, we're at a stage now where we have to do this.
a1223 00:15:48.803 00:16:06.094 Those- the people who think it's a great idea. And all the people, and I think there are a lot of people at TED, who do things, because even though there's not an immediate return on investment, you have- because it will only really pay off when everybody else has done it, they'll do it, because they're the sort of person who just does things which would be good if everybody else did them.
a1224 00:16:06.094 00:16:08.175 OK? So it's called Linked Data.
a1225 00:16:08.175 00:16:09.574 I want you to make it.
a1226 00:16:09.574 00:16:11.531 I want you to demand it.
a1227 00:16:11.531 00:16:14.122 And I think it's an idea worth spreading.
a1228 00:16:14.122 00:16:18.166 Thanks.