There are millions of news articles, blog posts, company profiles, and patents out there and we are increasingly facing an overload of information. The time we spend trying to parse the data to make informed decisions, but also trying to identify new connections and trends, is growing exponentially. We talk to Dan Buczaczer from Quid about how machine intelligence is helping us interrogate the world’s collective intelligence.

Sandra Peter: There are millions of news articles, blog posts, company profiles and patents out there with new ones being added every day. We are increasingly facing an overload of information - the time we spend trying to parse the data to make informed decisions but also trying to identify new connections and trends is growing exponentially. What's the future of food? Is Apple building a self-driving car? What are consulting companies thinking about these days? Recognised by the World Economic Forum in 2016 as a technology pioneer alongside Blockchain and Slack, San Francisco based company Quid helps map, query and visually explore massive amounts of information and find new ways to navigate the world's collective intelligence.

Introduction: From the University of Sydney Business School, this is Sydney Business Insights, the podcast that explores the future of business.

Sandra: I'm Sandra Peter and today we talk to Dan Buczaczer from Quid about how machine intelligence is enabling us to extract intelligence from complex, unstructured text data sets and how pairing it with human intuition allows us to gain insights into what is happening in terms of investment, into new products and services, or into what people are writing and talking about. Dan Buczaczer is responsible for getting attention for Quid through a mixture of storytelling, press relationships, and experiences all designed to demonstrate the power of understanding text based data at scale.

Sandra: Welcome and thank you for talking to us.

Dan: Thank you.

Sandra: There are millions of news articles and blog posts and company profiles and patents out there. How can we identify connections and trends in this mass of increasingly complex information?

Dan: The goal at Quid, basically our mission, is pairing machine intelligence with human intuition to enable organisations to make decisions that matter. And decisions that matter tend to be very complex, right, they're based on a complex amount of information. So rather than building a company that spits out an answer, like a lot of other companies where you put in a lot of information and they sort of give you one answer, our goal is basically to present that complexity but visualise connections and patterns within that data so you can start to make sense of all of that information. For example I've done analysis on what is the future of food. That's obviously a very broad question and there's a lot of different ways to answer that question. So when you put that into a Quid network you end up seeing a lot of different clusters. Those clusters are made up of individual pieces of information like you mentioned: news articles, blog posts, reviews, whatever you've put into that particular search. And those are represented as nodes that then cluster naturally based on similarities in language. So in the example of what is the future of food you might have one cluster that's around funding activity with start ups that are focused on food in some way. There might be a news cluster around policy and politics in terms of feeding the world, GMOs and food safety. There might be another cluster – or there is another cluster in this case specifically – focused on a specific part of the industry like fast food and understanding how it's changing over time.

Another cluster might be new companies in the food space like grocery home delivery and meal kit subscriptions. And another one may be analysing ingredients that are gaining in popularity in restaurants. We can dig into each of these sections of the network separately but we can also zoom back out and see how these topics connect with each other based on that common language. And so we might find that the changes in fast food cluster is heavily connected with both the cluster on ingredients that are gaining in popularity and one that is focused on healthy foods. And you can understand that the changes in fast food are being heavily influenced by those two. Most of the talk about changes in fast food has to do with food getting healthier as well as the ongoing rising and falling of ingredients that are trendier.
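Editor's note: the idea Dan describes, documents as nodes that cluster because they share language, can be sketched in a few lines. This is only an illustration under stated assumptions and not Quid's actual method, which the interview does not detail. The word-count cosine similarity, the `threshold` value, the function names and the four sample headlines are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample documents; not data from Quid.
docs = [
    "meal kit delivery startup raises funding",
    "grocery delivery startup raises new funding round",
    "fast food chains add healthy ingredients to menus",
    "healthy ingredients trend on fast food menus",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_documents(documents, threshold=0.3):
    """Link documents whose similarity meets the (assumed) threshold,
    then return the connected groups as lists of document indices."""
    vectors = [Counter(tokenize(d)) for d in documents]
    parent = list(range(len(documents)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(documents)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            parent[find(i)] = find(j)  # an edge merges the two groups

    clusters = {}
    for i in range(len(documents)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

if __name__ == "__main__":
    print(cluster_documents(docs))
```

With these sample headlines, the two funding stories end up in one group and the two fast food stories in another, because each pair shares most of its vocabulary and the pairs share none with each other. A real system would need far richer language features than raw word counts.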

Sandra: That sounds like a fascinating insight. And you've been recently recognised alongside Slack and Blockchain as a technology pioneer at the World Economic Forum in 2016. How is Quid contributing to the future of the Internet?

Dan: The Internet obviously is the largest trove of information that's ever existed by a long shot. And a lot of that information overwhelmingly is language. What Quid does is visualises and draws connections between similarities in language which is very different from a lot of data tools out there which do that with numbers.

Dan: But what you gain with language especially when you visualise it like Quid does is you gain context. You start to understand what are the connections between these various themes and narratives and all of these pieces of text that describe how we act, how we think, how we feel. And it allows you to draw these larger connections about the world and put all of that information on the Internet to really this brand new use. So that was really what we were recognised for with the Tech Pioneer Award. And it's really what we're trying to crack is there's so much information out there it can be so valuable but it's very hard to bring it together and make coherent sense of it.

Sandra: When you talk about language what kind of language might you look at with Quid?

Dan: Language is a data source. We have three feeds that we are pulling in every minute of every day consistently. One is news and blogs. So everything being written from more than 300,000 sources worldwide - that includes blogs, it includes major newspapers and magazines, it includes very small websites as well. The second type is company profile information, so profiles of any company that has received private venture investment. We include information that comes from official company profiles that have been written about them as well as information from their websites so you don't just get the typical description of that company or the typical category that normally they're associated with. You might also get some really interesting other pieces based on the language they use to describe themselves. The third piece is patent information. So any patent that has been filed ever through the Patent Office is imported into Quid.

Dan: Now on top of that we have a feature called Opus that allows you to upload any set of text data that you want. So we've had clients that have uploaded anything from reviews on review sites, to customer survey information from their own internal surveys, to information being posted on the web in forums whether it's for a gaming company or a pharmaceutical company.

Sandra: Can you walk us through some of the types of new questions that we can now ask with Quid?

Dan: When we're looking to answer these complex questions we're really focused on two main arenas we're trying to tackle. One is understanding markets and the other is understanding narrative. Understanding markets really has to do with what is happening in terms of investment, in terms of companies making moves, hiring people, introducing new products, filing patents. It gives us a sense of what's happening from a very business heavy perspective. The second one, understanding narrative, really helps us understand what people are writing about and talking about around a particular topic, whatever topic you're interested in, and how that is potentially shifting over time. And so within each of those there are some deeper examples we can go into. Within markets for example, it might be understanding an entirely new emerging market like telehealth, right, which is this idea of seeing a doctor over your phone, over the Internet somehow, and understanding who are the companies that are starting to appear in that space, who is investing in those companies, what large players are involved. Are there telecommunications companies involved in that space? Are there major medical providers that are getting involved in that space? And really understanding sort of where an emerging industry is headed. Another is tracking investment trends around any particular sector. A third might be understanding how you compare to your competitors in terms of companies that have been invested in or acquired. So we certainly look at the AI space, which is obviously very active right now, and understand where Google is placing their bets in this space, in terms of other emerging companies, versus Microsoft versus Facebook versus Apple. Turns out they have very different profiles and you can see that visually across a Quid network.

On the narrative side of things, there's a couple of examples as well. We can look at the public narrative in the press around any topic. It could be a broad topic like sugar, you know, people being anti sugar, or maybe less so now, and who are they blaming. Are they blaming parents? Are they blaming companies? If they're blaming companies, are they blaming food manufacturers, candy companies? Is it cereal manufacturers since the breakfast, or is it fast food, or is it soda with all the talk about soda tax. We've certainly had companies come to us and look at it and sort of figure out who seems to be getting most of the blame around this. And if you're one of those companies it's time to panic. If you're not, the question is how do you then act given that information. So it could be specific brands, it can also be these larger topics. We also can look at the voice of the customer, whoever that customer is: what are they saying when they're on review sites about the hotels they're staying in or the shoes that they're wearing. Or it might be somewhere like in a public forum. We have looked for example in depth at forum entries from people who suffer from diabetes and are trying to obviously manage their condition. And you can learn all sorts of things: you can learn what they're frustrated with on a daily basis, what gets them excited, where they feel they are triumphing over this condition or at least have small wins along the way. You can understand their routine in terms of how they manage their diet, what they allow themselves to do versus not to, how they manage administering insulin shots and taking medications.

And then of course woven into all of that are the companies that help people manage their diabetes. So you've got all of the pharma companies, you've got the folks who make insulin pumps and all of the equipment involved. And you can understand where do they fit into that story as patients are logging in every day and talking about managing their condition.

Sandra: How are you helping organisations grow or innovate or manage risk?

Dan: We're helping companies to grow and manage risk in a couple of ways. One is keeping track of your competitors. We actually worked with the car company Hyundai to really try to break down the question: is Apple developing a car? And if they are, what can we learn about what we think they're developing? And so we talked earlier about the types of data sets. This is one where you’ve got to put a lot of that together. Right. So we looked in the news and on blogs at all of the rumours and hearsay that had been written about the potential type of car company that they might be building. And when you look at it at scale in Quid you see which things form large clusters, which are rumours that are probably at least worth paying attention to. We of course don't know exactly whether they're true or not. That was one data set we looked at. Then we looked at all of the hiring information we could find about who Apple had hired over the last few months. What were their backgrounds? Some of them came from car companies themselves. Some came from navigation companies, some came from safety companies or transportation, for example. So that was another data set. Then we looked at a data set of the companies that Apple had invested in recently or acquired. And of course with some of them you could see pieces that seemed to be focused on materials that are often associated with cars. Again none of this can we completely confirm but you start to put together all these data points. And then a fourth data set that we looked at was all the patents that Apple had filed for recently because that's public information. So between all of these pieces you can put them together and start to get really what I thought was probably the most comprehensive view I've seen yet of what Apple is likely up to with developing a car, and of course we'll find out at some point here how great we were and that will be very interesting.
It can give you a huge amount of intelligence around what a competitor either is up to, or in this case might be up to.

Sandra: It's going to be fantastic figuring out whether you were right or not.

Dan: Exactly. Another example of managing risk and staying on top of things: we have another client we work with very closely who is tracking about 12 different topics and they're tracking them consistently, perpetually. And these are topics that America is always talking about. And opinions shift over time and each one of these affects our client tremendously. One of them is around the minimum wage. At times people think the minimum wage is going to limit business. And at other times they think it's a fair living wage for folks, and of course some people are always on one side. But there are opinions in the middle that shift and it's going to affect this company very much so they're looking at the public narrative around that. Another one is around organic foods and GMOs, right, and how that conversation is trending. A third is around guns, gun ownership, gun safety etc. Another one is around immigration, which especially in a political season has shifted all over the place. So Quid allows them to see these topics at scale and then on top of that they can figure out ‘where's my brand being pulled into the conversation, if at all?’. And are they being pulled in on only one side or on both sides? What's the sentiment toward my brand as it pertains to this issue? And so really it becomes a crucial piece for the risk analysis and risk management team to understand: might we be headed towards a PR crisis, or is there something we can do along the way to stay on the right side of this issue?

Sandra: Speaking of the public narrative has Quid looked at the Trump presidency and the Trump election?

Dan: Yes. We look at politics quite a bit because as I said earlier narrative analysis is one of the strengths of Quid and politics is nothing if not an evolving narrative over time. So two examples I'll give you: one is during the heat of the election, really down the home stretch. Clearly the US presidential election was marred by controversy after controversy. It seemed like every day or week there was a new controversy. And so NPR, the radio station, actually came to us and asked if there was any way we could predict what was likely to happen in the final month of the presidential campaign in terms of controversies, both for Trump and for Clinton. And their bet was that Trump especially was unpredictable because he seemed unpredictable from week to week. What we did is we took news of every controversy that had hit both Trump and Clinton since the minute they'd announced their candidacies. And we looked at them in Quid and then we were able to extrapolate and turn them into a pattern in terms of: did that issue die quickly? Did it stay for a long, long period of time? Or did it go away and come back again in a series of waves? And there were examples of each of these. And what we found – and this surprised us as well – there actually were only six patterns to controversies over the 93 controversies that we identified. They all fell into six patterns and then we could figure out with what frequency each of these patterns occurred. It's different for Trump than it was for Hillary. Hillary actually only had three by the way of those six patterns but the ones she had stuck around for a long time. We can think of some obvious examples like Benghazi and e-mail that just would not go away. Trump, being the master of controversy, had all six and had quite a few of each of them. But interestingly a lot of them would go away. We think largely because the next one would actually pop up in its place.
We were able though to predict in the last month what types of controversies were going to come up and how often, and so NPR actually turned it into a bit of a game where they put us up against two, quote, "human pundits", one on the left and one conservative on the right. And each of us had to predict the five controversies for both Clinton and Trump that we thought were going to happen over the course of the final month of the campaign, and they turned it into sort of a fun game.

Sandra: So how well did Quid perform?

Dan: So Quid beat the humans. Sorry humans to report that. What was very interesting about it is in the first week or 10 days it was actually very close because any campaign has quite a bit of unpredictability to it especially in a short term. But what we found is as the month wore on Quid was more and more right. Those patterns certainly did show up again. Honestly we were panicking a little bit in the first 10 days. It was very close. Given that this is our bread and butter in terms of what we do, we didn't want to lose. But what was interesting is we kept tracking in the last three weeks of the campaign after the radio story aired. And that gap just kept widening and widening and it really showed us in vivid detail that really with more and more time these things become even more predictable. Why? Because we get to collect more and more data and so we have an advantage that humans don't, they just can't process those millions and millions of data points and so they apply their biases we get to just look at the patterns. And it was interesting a few times we were actually very tempted to change our picks because they did not feel intuitive and yet we didn't, and sure enough that pattern came back and proved us right.

Sandra: How are you looking at Trump today?

Dan: We'll continue to look at the political landscape in general and Trump specifically. We actually have something on our blog today a partnership that we've done with a customer survey company called Dscout and it's a project that we called the Trump Diaries. What they have been doing from the minute the election was over is following people who voted for Trump and people who did not vote for Trump. That means that they voted for Clinton or Jill Stein or Gary Johnson or one of the third party candidates and they've asked the same questions. How are you feeling about Donald Trump today? And how are you feeling about the state of the United States today? They asked that right after the election and they've been asking it in an ongoing way since then and people have been recording their thoughts. They've actually been doing video recordings of them and talking about how they feel on any given day about this. And what they've done with that data then is uploaded it into Quid. That's a perfect example of what you can put into Quid and we can look at all of these individual pieces from over a thousand people at scale. And what we have found in this period is that basically so far it looks like Trump's transition period was a series of flash points which is very similar to how we talked about the controversies before. He goes from controversy to controversy and his followers get emboldened by that and love him even more. And his detractors of course are even more despondent at those times. But the people in the middle are what are interesting and how they wax and wane over time and at times they're feeling more positive about him and other times they're feeling more negative. Almost what was the undecided vote kind of before the vote happened.

Dan: Probably the biggest insight we have found in this piece so far is as it got closer to Inauguration Day it was around his Twitter use and his tweets specifically that was starting to gain heavier and heavier percentage of the negative, so the negative has become very focused around his use of Twitter. And it will be interesting to see if that continues or if it changes. But the Trump Diaries is a project that is going to continue into his presidency, and it will be interesting to watch how this same set of Americans basically how their opinions change or don't over time.

Sandra: Will be very interesting to follow this and thank you for your insights today.

Dan: Happy to be a part.

Outro: You've been listening to Sydney Business Insights, the University of Sydney Business School podcast about the future of business. You can subscribe to our podcast on iTunes, SoundCloud or wherever you get your podcasts and you can visit us at sbi.sydney.edu.au and hear our entire podcast archive, read articles and watch video content that explore the future of business.