Stefan Weitz is a Director of Search at Microsoft and is charged with working with people and organizations across the industry to promote and improve Search technologies. While focused on Microsoft’s product line, he works across the industry to understand searcher behavior and in his role as an evangelist for Search, gathers and distills feedback to drive product improvements. Prior to Search, Stefan led the strategy to develop the next generation MSN portal platform and developed Microsoft’s muni WiFi strategy, leading the charge to blanket free WiFi access across metropolitan cities. A 13-year Microsoft veteran, he has worked in various groups including Windows Server, Security, and IT. Stefan is a huge gadget “junkie” and can often be found in electronics shops across the world looking for the elusive perfect piece of tech. You can follow Stefan on Twitter.
- Initially when Bing launched social search, they wanted to carve out a distinct space for the social results. Later on it became clear that these worlds were blending together and it made less and less sense to keep them in a separate space.
- Bing is now indexing 30 times more data from Facebook than they had previously. On average, people will see about 5 times more results than before.
- While Bing is doing a much better job of harnessing users’ relevant friend information, they are also focusing on relevant “expert” information as well; influential bloggers, subject matter experts…
- Even though search and social results are blending, they are still kept separate because really, how can anyone decide which of those to rank more highly?
- The notion of a Like is still a little bit perplexing from a ranking perspective. What does a Like mean for a page? Does the user like the design, the content, or maybe just the picture? Bing tends not to just use a pure Like signal to do ranking.
- Shares are basically the same as Likes – not used a ton for web ranking except in velocity (like the way Twitter is used for discovering news).
- It’s an uncharted territory as far as what are the best types of queries for social search. It may be that in social search every query should have a person as an answer. Even something like, “what’s the height of Mount Everest,” a very definitive, objective query should have human results.
- Bing’s social search has combined together four different services and applied a layer of machine intelligence on top and applied a layer of semantic knowledge on top of that to deliver that one result; something no one else is doing right now.
- When someone changes privacy settings or deletes a post from Facebook, Bing gets that update in real time. The result is then purged from their results in minutes not hours or days.
- The social pieces in the Facebook experience were all developed by Facebook. Bing uses their own algorithm on the social search data for their social search results. It is completely independent of what Facebook does with Graph Search, even though it operates on the same data set.
- When you search on Bing, it gives you the web results plus all the different updates that come from the Facebook social graph. On Facebook it really pivots more around the person and their interests.
Full Interview Transcript
Eric Enge: Let’s talk about Bing and Social Search!
Stefan: Initially when we launched social search, we really wanted to carve out a distinct space for the social results. That was done partially from a user experience standpoint to identify the fact that we think social results are often very different than web results. The web results are what the web knows about your query; the social results are what people know about your query.
As we really got into it, it became clear that a lot of times these worlds were blending together and it made less and less sense to actually keep them separated off in that carved-out space. They are still separate in the new experience, but it’s much more in line with the overall experience than it was before.
Let me show you what that looks like. If I try something simple, like Hawaii, what we get are the web results on the left-hand side. In the middle, you get our snapshot, which pulls in data and services from across the web. You can see people who were born in Hawaii, who their governor is, celebrities who are from there, all sorts of different things.
But where it starts to get really fun is here on the right-hand side. You see a post from my friend Geoffrey on Facebook from a couple days ago. I can see Bob Bennison, the Chicago guy who’s often talking about something political. I can scroll down and see lots of different updates from all of my friends on Facebook that talk about Hawaii.
Until recently, we only showed you what my friends Liked, and if they were from there. Now, we’re indexing much more data from Facebook. This includes relevant status updates, comments and photos.
We still get Like and Share data, but because we now get the update and comment data, we amped up the amount of data that we index from Facebook by about a factor of 30. People will see, on average, about five times as many results as they would have seen in the previous experience.
Eric Enge: So previously you were just getting Like and Share data, and now you’re getting a whole lot more?
Stefan: Yes, we were just getting Like and Share data, and some profile data; where you were born, where do you work, those types of things. Now, it’s doing a much better job of using the social footprint that people leave across the web.
While we are doing a much better job of getting users relevant information from their friends, we know they’re not the only piece of the puzzle. The other area that we have focused on is what we termed, in the old version, the “experts.” We’re looking across Klout, Google+, blogs, Twitter, all sorts of social mediums. We’re actually pulling in experts about Hawaii. So here is Ryan Ozawa, who has a 69 Klout score about Hawaii. This is ‘Hawaii Five-0’, the TV show. Here’s Sara Benson, who’s written an Oahu guide and posted it on Google+. Here’s another Hawaii Twitter account.
So now you can see that we’re really scanning across all ranges of social networks where people are leaving their social footprints. We’re breaking down the silos and bringing them all together into that right rail. So it is definitely a big change from what we had before.
Another example we can use is my daughter’s love of Maroon 5. Here on the right, we’ve got the social results.
I’ve got my buddy John who shared a video of the band. There’s also a photo of my daughter with an autographed Maroon 5 guitar that my sister posted. (Somehow Santa knew to bring her the guitar even though she doesn’t play, but she’s going to learn.) Because my sister shared this on Facebook, and the update has the word ‘Maroon 5’ in it and because of my relationship with my sister on Facebook, I’m actually able to see that photo right here in the experience. Pretty slick.
I can see other friends as well. My friend Whitney posted a photo and a comment about Maroon 5. And as I mentioned before, I can scroll down and see their official social networking accounts. Here’s Adam, who’s the lead singer, and here are some questions and answers about Maroon 5. Again, right there inside the experience.
So as you can see, these are all things that never would have made it to the top of a search page, no matter how personalized it would be. Something like this, the official site, is likely to always take precedence over the picture of my daughter on the page. It demonstrates why we think it’s so important to not blend the results, because really, how can anyone decide which of those to rank more highly?
They’re really, fundamentally, two different ranking models. One is ranking using a static rank, or page rank; the left-hand side of the results. On the right side is a whole new notion of social rank; how we can actually figure out, from the many networks that we have feeds from and scan, what of that data is likely to be the most important thing to show a user for that query.
Eric Enge: So, the web-related signals are not influencing the order of the social results, and the social signals are not influencing the order of the web results?
Stefan: In this experience, that’s absolutely true. That being said, we are looking at social signals to annotate these pages, and honestly, if something is getting a huge number of, say, Tweets, for example, that does factor in to the ranking as far as showing fresher content up top. But it’s very nuanced right now. Generally, those two are fairly separate.
You can see here, for example, that we’ve annotated the domain for Stubhub, showing a friend of mine who Liked Stubhub. It’s also showing me the list of shows. This will boost it in the results because of my location and the fact that Maroon 5 is coming to Seattle in about a month or so.
If I hover over the little thumb, I see my buddy who Likes this domain, and this result was actually boosted in the rank because of the fact that this is talking about a Seattle location, Key Arena. So that actually benefited from two different signals, one a social one and one a geospatial one.
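As a rough illustration of the distinction Stefan draws here, the sketch below treats a friend's Like as an annotation only, while a location match changes the ordering. It is not Bing's implementation: the `WebResult` type, its fields, the `personalize` function, and the `geo_boost` multiplier are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class WebResult:
    # Hypothetical result record; not a real Bing data structure.
    domain: str
    base_score: float
    locations: set = field(default_factory=set)
    annotations: list = field(default_factory=list)


def personalize(results, friend_likes, user_city, geo_boost=1.5):
    """Annotate results with friends' Likes and re-rank by location match.

    friend_likes maps a domain to the friends who Liked it (assumed shape).
    The social signal only adds an annotation; the geospatial signal is the
    only thing that multiplies the score (geo_boost is an assumed value).
    """
    for result in results:
        for friend in friend_likes.get(result.domain, []):
            result.annotations.append(f"{friend} likes this")

    def score(result):
        boost = geo_boost if user_city in result.locations else 1.0
        return result.base_score * boost

    return sorted(results, key=score, reverse=True)
```

In this toy version a ticket site that mentions the searcher's city can move above a higher-scoring official site, while the Like itself changes nothing except what is displayed next to the result.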
Eric Enge: So what we see there is an example of a social signal causing an annotation and a local signal causing a ranking change.
Eric Enge: And something that we’ve all known for a while is that when there are bursts of activity in social media, a related query deserves a freshness impact. This causes something to be seen as news. However, Likes are not being used in a fashion similar to links to re-rank web search results.
Stefan: Not now, no. The notion of a Like is still a little bit perplexing from a ranking perspective. What does a Like mean for a page? Does the user like the design, the content, or maybe just the picture? We tend to not just use a pure Like signal to do ranking. There may be some small boost. Over 1,000 signals are used in ranking, right?
Eric Enge: Yes. So the user’s intent behind a Like is pretty ambiguous.
Eric Enge: Sometimes it’s just telling the person that, hey, I read your article.
Eric Enge: I wonder if everybody in America was Liking their 10 favorite things they encountered every day if the Like could be a very rich signal, but I guess it really doesn’t work that way.
Stefan: If you scoped it that distinctly saying that you Like the things that you find interesting every day, that would help the signal. But today, we don’t even know why someone’s Liking something, which is why comments and those types of things are actually a little more interesting.
At least we can use some semantic parsing to see that one person loves Maroon 5, and that someone else dislikes Maroon 5. And that begins to allow us to actually do some more interesting work with the social signal beyond just the binary of Like.
Eric Enge: What about Facebook Shares?
Stefan: Shares are basically the same as Likes – not used a ton for web ranking except in velocity (like Twitter).
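One way to picture "velocity" is as the number of shares seen inside a recent time window, so that a sudden burst stands out from a slow trickle. The sketch below is only an assumption about what such a measure could look like; the `ShareVelocity` class, its method names, and the one-hour default window are invented for illustration and are not described in the interview.

```python
from collections import deque


class ShareVelocity:
    """Counts shares inside a sliding time window (illustrative only)."""

    def __init__(self, window_seconds=3600.0):
        self.window = window_seconds
        self._times = deque()

    def record(self, timestamp):
        # Assumes timestamps (in seconds) arrive in non-decreasing order.
        self._times.append(timestamp)

    def shares_in_window(self, now):
        # Discard shares that fall outside the window ending at `now`.
        while self._times and self._times[0] <= now - self.window:
            self._times.popleft()
        return len(self._times)
```

A page whose count jumps sharply from one window to the next would be a candidate for the kind of freshness treatment discussed above; the threshold for "sharply" is left open here.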
Eric Enge: Yes, but the key value with your new social search implementation is that you’ve got two modalities of searching, and you’re now allowing those to be presented simultaneously. You were doing that before, but with the update and comment data your data set increased by 30 times.
Stefan: That’s right. It’s a huge number. It makes it really hard because now there is just so much more data and you can see how fast it gets returned. So it’s a very complex data set to run.
Eric Enge: What are the kinds of queries that will do well in social search? Where is its sweet spot?
Stefan: The types of queries that are going to do well in social search are ones that generally might not have definitive answers. They’re not objective queries. The tallest mountain in the world is clearly Mount Everest. That’s probably not the best candidate for a social search result.
But when you’re talking about queries like “where should I go in Seattle for good food,” or “what’s the best way to get from the Chicago Airport to the Gold Coast,” you are likely going to get the most from returning a person who might know. To go from O’Hare Airport all the way into the city, I can go Uber, I can get a regular taxi, I can rent a car, or I can take a train.
But there are a number of variables that factor into that decision. For example, when do I arrive? Well, if I arrive at midnight then getting a car into the city is not that big a deal because there’s not a lot of traffic. However, if I arrive at 4:00 PM, I would never take a car because it would take an hour and a half, whereas I could hop a train and be there in 20 minutes; so there’s that variable.
There’s also my price sensitivity. Maybe I’m OK to spend a lot or maybe I only want to spend a little on getting in. There are a number of variables that I have to take into account when it comes to getting from the airport to the city. A search result from a standard web algorithm search isn’t likely going to be able to capture and factor in all those variables.
A person, however, who say lives in Chicago or has posted photos of the airport for example or likes a particular car service on the web, might be seen as a person that I know who is likely able to have some insight into that particular question.
Eric Enge: You can also ask follow-up questions, right? The price sensitivity, or the time of day would probably not be specified in an initial search query.
Stefan: Right. Let’s say that I come from a smaller town where the rush hour is really almost non-existent. Let’s use Spokane, Washington as an example, where rush hour is from about 5:00 to 5:05. Because I’m coming from a small town I may never think to ask about time of day, whereas if I find somebody who lives in Chicago and I ask them, “Hey, how should I get downtown from the airport?” the first question they’re likely going to ask is what time I’m getting there, because they know that’s a real factor in that decision making.
Stefan: It’s honestly an uncharted territory as far as what are the best types of queries for social search. It may be that in social search every query should have a person as an answer. So even something like “what’s the height of Mount Everest,” a very definitive, objective query, should have human results. I have a friend named Jeff who has actually climbed Everest a few times, so Jeff and his Everest photos might show up as a result of that query I’m doing in Bing.
I might even decide, “Well, shoot, I know Everest is the tallest mountain, but Jeff can actually give me some real detail as to just how tall it is.” Maybe he can tell me how little oxygen there is at that height or I could see pictures of what it looks like from the top of the mountain that he’s climbed a number of times. I’m often surprised by what actually happens when I do those queries and get those social results on the side.
Eric Enge: Can we look at some example queries?
Stefan: Sure, I’ll try “Seattle restaurant” in my co-worker Chris’s account and see what happens. Here’s a very good example of a query where there’s a lot of structured data and a lot of web results. Obviously a very generic kind of query.
Here on the right-hand side, I see an update from one of Chris’ friends that Nell’s is one of the best restaurants in Seattle you never heard of. I tend to agree. It’s a really good restaurant. But he’s actually posted a comment on Facebook linked to the restaurant right there.
If I scroll a little further down, I can see Rebecca who’s actually asked a question, “What are some good vegan restaurants in Seattle?” And what’s cool about this now is I can actually see there are four responses back to that. So I can click on that, and go and look at that.
I now have all these vegan restaurants suggested by people who have commented on that question from all across Facebook.
I can message back and ask her, “Hey, which one did you choose and was it any good?” Something you just aren’t going to get even in the great web search experience that we have here on Bing.
Also, it’s not just my friends. We also talk a lot about the fact that we should be able to find information across all these different silos of social data whether it’s Quora or Twitter or Facebook or Google Plus, some blogs or LinkedIn. There are a number of these vertical social networks that allow us to build a better response to a query using all those kinds of footprints. Our approach to this is really unique.
We also see an influential food blogger in Seattle named Ann Lee. We know she is influential because there are a number of things that we use to determine that. We look at what she posts and how often she writes, then we run a semantic algorithm against that to figure out that she’s influential about food.
I can hover here and see that that’s her Twitter account so that I can click on it and actually see it. These are actual blogs where she’s written about Home Grown Café, Darius Pie, all right here. I am very proud of this result. We also found her blog and pulled through all the reviews that she’s left about Seattle restaurants into this result from her blog.
This module alone has combined together four different services and applied a layer of machine intelligence on top and applied a layer of semantic knowledge on top of that to deliver that one result. That’s something no one else is doing right now and I just think it’s remarkable.
Eric Enge: Excellent. I’m going to guess that entertainment related searches work well, too.
Stefan: This is an interesting one. I did a search for Coldplay. On the right rail it looks like there are some comments from Facebook.
Eric Enge: I see in the social results someone who has been to a concert, which means you can find out from them whether or not they are as good in concert as they are on the records.
There are lots of bands who don’t do as well on stage as others. You’ve got people who are just known for being tremendous performers and so you can ask those kind of questions.
Stefan: I also like Sara Bareilles and I can see that she’s covered a song named Yellow which is an old Coldplay song that I really enjoy. This result of Sara Bareilles and Coldplay would never have been on page one of the web search results. These are things that I wouldn’t have otherwise discovered.
We also show Coldplay’s official Twitter account. We pull in the Klout scores so you know that it is really them. Bing is integrating all these different web services in a way that taps into the social graphs.
Eric Enge: That’s awesome stuff. I understand that people can make things not public in Facebook, but there is information that’s already out there that they published under an older set of rules. What are your thoughts on that?
Stefan: On our social sidebar what you’re seeing from your friends is only information or updates that have been granted permission via Facebook. Nothing is shared that is marked private on Facebook.
We know that privacy is a big deal when thinking about social data. We also comb through Twitter, which by definition is public unless you lock down your account. The same is true with blogs, obviously they’re public unless for some reason you’ve got a password on it.
We’re really just mining public social data. Now we get feeds in many cases to make it more efficient so we get the Twitter fire hose, for example, in real time so it’s much more efficient than say crawling. And we get other feeds from Klout for example. That said, the privacy angle is certainly something we’re always thinking about.
Eric Enge: I think people have identified concerns because of the content being exposed in a much more efficient manner. Being able to search this content makes it much easier to find this type of data. For example, I saw something that somebody wrote about a dissident group in China being found through Facebook. It probably could have been done before, but I guess the reality is that it was easier to find because of the search capability.
Stefan: You’re right. We’re certainly making it easier to tap into that social data set. However, if someone’s post is public and it has surfaced here, the user can delete it. For example, if Lisa doesn’t like that this particular link surfaced in Facebook, it’s simple for her to go in and either delete it or change the privacy settings on that particular update.
Because we have that relationship with Facebook, when she changes privacy settings or when she deletes the post from Facebook, we get that update in real time. So the next time you run a query, you wouldn’t see it. Because of the privacy sensitivity that we have, it’s as real time as you can get. The result is then purged from our results in minutes, not hours or days.
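The purge-on-update behavior described here can be pictured as an index that reacts to delete and privacy-change events as they arrive. The following is a minimal sketch under stated assumptions, not a description of how Bing or Facebook actually exchange updates: the `SocialIndex` class, the event dictionary shape (`type`, `post_id`, `allowed`), and the substring search are all hypothetical.

```python
class SocialIndex:
    """Toy in-memory index of posts a searcher is allowed to see."""

    def __init__(self):
        self._posts = {}  # post_id -> post text

    def add(self, post_id, text):
        self._posts[post_id] = text

    def handle_event(self, event):
        # Assumed event shapes (hypothetical, for illustration):
        #   {"type": "delete", "post_id": "p1"}
        #   {"type": "privacy_change", "post_id": "p1", "allowed": False}
        post_id = event["post_id"]
        if event["type"] == "delete":
            self._posts.pop(post_id, None)
        elif event["type"] == "privacy_change" and not event["allowed"]:
            # The searcher may no longer see this post, so purge it.
            self._posts.pop(post_id, None)

    def search(self, query):
        # Return IDs of indexed posts whose text mentions the query.
        q = query.lower()
        return sorted(pid for pid, text in self._posts.items() if q in text.lower())
```

The design point is that removal happens when the event is processed rather than on the next scheduled crawl, which is what makes a purge within minutes plausible.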
Eric Enge: I guess it’s incumbent on participants in social media networks to realize that it’s a public environment. If you write it, people can find it, and you need to understand that as you interact on these networks.
Stefan: I think people are getting more knowledgeable about privacy settings. I think they are becoming either more comfortable with things being shared or they’re being more careful about what they are sharing. This space is changing quickly and networks are rising and falling so quickly, that it’s tough for the average person to keep up on all this stuff. But I think if you are going to engage in it, then you need to make sure you understand both the benefits and the responsibilities that come with social data.
Eric Enge: Did Bing play a role in advising or working with Facebook on their Graph Search or did you focus primarily on integrating it into Bing itself?
Stefan: We’ve focused primarily on the latter and how we take that Social Graph data and make it part of Bing. The Facebook Graph Search was something they worked on by themselves. Part of the reason is that it’s a very different problem. When you think about what Facebook Graph Search is doing, it’s searching structured data. Let’s look at a search for a user on Facebook. You can see what musicians she listens to. Now we get to see all the musicians that she has actually liked across Facebook. Then you can refine that search.
If I am more curious about what movies she liked, I can search on that instead. In essence you can see what they’re really doing which is great. It is allowing you to tap into that structured set of data that they have about that account so they know that this account likes these certain things.
Or, you can search on Superman, the movie, which is coming out soon.
If you want to go and do a web search, you can just go down here and boom, hit that and then you’re off to the races using Bing’s technology to actually conduct a web search. There are the web search results for Superman right inside of Facebook and this is all powered by Bing.
So that’s the Bing contribution, if you will, to Graph Search. But overall the social pieces in the Facebook experience were all developed by Facebook.
Eric Enge: Does Bing implement its own algorithm on the social data from Facebook for its social search results?
Stefan: We use our own algorithm on the social search data for our social search results. It is completely independent of what Facebook does with Graph Search, even though it operates on the same data set. We had to develop an entirely new method on the right rail to rank social search results. Things that have more updates, likes or comments tend to rank more highly than things on the same topic that don’t.
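As an illustrative aside, the idea that items with more updates, likes, or comments tend to rank higher can be sketched as a simple engagement sort. This is not Bing's algorithm, which the interview does not describe in detail; the `SocialPost` type, its field names, and the equal weighting of the three counts are assumptions made purely for the example.

```python
from dataclasses import dataclass


@dataclass
class SocialPost:
    # Hypothetical record; not a real Bing or Facebook schema.
    author: str
    text: str
    likes: int = 0
    comments: int = 0
    updates: int = 0


def engagement_score(post):
    # Assumed equal weighting of the three signals named in the interview.
    return post.likes + post.comments + post.updates


def rank_for_query(posts, query):
    # Keep posts that mention the query, then order by engagement, highest first.
    q = query.lower()
    matching = [p for p in posts if q in p.text.lower()]
    return sorted(matching, key=engagement_score, reverse=True)
```

A real system would presumably also weigh the searcher's relationship to the author and the recency of the post, both of which this sketch ignores.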
Eric Enge: So you are doing your own ranking of the social search content?
Stefan: Yes, very different. Facebook is showing something totally different. When you search for Coldplay on Bing, we give you the web results plus all the different updates that come from the Facebook social graph. On Facebook it really pivots more around the person and their interests. They are two very different methods of retrieval.
Eric Enge: Thanks Stefan!
Stefan: You are most welcome!