Getting Linked In to Data Science with Dr. Igor Perisic


Fundamentally, at the core of
everything, you want an interaction to be very natural. Whether the design is superb, and it fits
exactly the way that you would anticipate it, it just feels natural. And that’s exactly where our field comes
into play. How do you make that thing natural? It’s not just a design perspective, but
it’s also, what is the content that you’re showing? How are you going there? So, it’s transforming that experience, instead
of just making it better. -You’re listening to the Microsoft
Research Podcast, a show that brings you closer to the cutting edge of technology research,
and the scientists behind it. I’m your host, Gretchen Huizinga. Big data is a big deal, and if you follow
the popular technical press, you’ll have heard all the metaphors: data is the new oil,
the new bacon, the new currency, the new electricity. It’s even been called the new black. While data may not actually be any of these
things, we can say this: in today’s networked world, data is increasingly valuable, and
it is essential to research, both basic and applied. Today, we welcome a special guest to the podcast. Dr. Igor Perisic is the Vice President of
Engineering and Chief Data Officer at LinkedIn, the social network for business and employment. Today, Dr. Perisic talks about the key attributes
of a data scientist, how AI and machine learning are helping personalize member experiences,
why we should all be big open source fans, and how LinkedIn is partnering with other
researchers through their innovative Economic Graph program to “create economic opportunity
for every member of the global workforce.” That and much more on this episode of the
Microsoft Research Podcast. Igor Perisic, it’s great to have you
here on the podcast joining us via Skype from Mountain View. Welcome. -Thank you very much. -Yeah. Igor, you’re the VP of Engineering at LinkedIn,
and you’re also the Chief Data Officer. So, in broad strokes, give us an overview
of what you do, and define your responsibilities in each role. -So, I’m an engineer. I build things. I build the data systems, the infrastructure,
that powers the back end of LinkedIn, and the offline systems upon which we can manipulate
our data. And I’m a scientist in the sense that I
build also the algorithms and the personalizations that we can provide to our members using those
systems, to make the experience better for members. So, the Chief Data Officer is bridging these
two, and has a little bit of a component around policy, in the sense of making sure that on
one side, the security, engineering and legal are talking together to address the problems
that can arise from there. -LinkedIn has been both a thought leader
and early adopter in the field of data science. Talk a little bit about that field in general,
and, more specifically, its history and current role at LinkedIn. -Well, so history. We can go far, far back. I mean, the terms “data” and “science”
have been defined, let’s say, in the English dictionary, somewhere around the 17th century. But fundamentally, I think it had another
change of direction somewhere within the last ten years. It came somewhere around 2007 or 8-ish,
when we worked at LinkedIn, and D.J. Patil, who was my product counterpart at that
time, somewhat coined a term within the, let’s say, more recent generation of it, which was
more about an individual who would really do five things: who could hack code, create
code, who can reason around data and see the product that it has inherently within it,
and hack the product for it. An individual who was really good about machine
learning, so it can actually create those algorithms. An individual that could communicate. So, you can see the story. You can, maybe, write it up. But you can communicate that to somebody else,
and not just towards somebody who’s very knowledgeable about it. And a great engineer who would build infrastructure
to it. Because you’ve got to remember that 5 or
6, 7 years ago, the infrastructure wasn’t there. So, you have to build everything. Today, I see it migrating back more towards
the ability to create the story, the ability to see the pattern through the data, and then
how it can actually help a product. -And the term “data scientist” is
also a very connotative term right now. -Yeah, I think originally it
was because we created something that was different. And we gave it those two terms. Then, later on, as with everything, something
becomes very hot and sexy, as a career. So, then everything becomes “data science.” -Right. -So, we do the same thing as
we call it, a relevance engineer is the same thing as we call it, data scientist, or that
we would call it just researchers, research engineers, data engineers. They all migrate around the same topics, so
certainly. -Now, you just mentioned a term, “relevance,”
which is a term I’ve seen along and around your website. What does “relevance” mean? -Actually, that’s a good question. It’s certainly something that’s very specific
at LinkedIn, where we move towards being much more data-informed in the way that we build
products and we take decisions through time. And then bringing all this optimization, techniques…
we didn’t want to call it like optimizers, optimize this, optimize that… but making
things more relevant. And that’s where we sort of keyed around
the term “relevance,” as the science to make things more relevant, which is fundamentally
to make something more personal to an individual. -I love that. And I think that resonates with a lot of people. So, let’s move over to the mission of LinkedIn
and it’s, to quote, “giving economic opportunity to every member of the global workforce.” So, tell us what role data science and research
play in making that mission happen. -Well, we do have a fundamental
role at the core, because when you create economic opportunity for every member of the
global workforce, you’re creating that economic opportunity for an individual. And that individual is always unique. Certainly, from, let’s say, a statistical
perspective, it’s part of a bigger bucket. But at the core, that opportunity is unique
for that individual. And in that situation, it means that you need
to actually build up your products or build up your experiences to that individual, and
to tailor it to him or her. And that’s exactly where our field comes
into play. You can create broad-scope experiences, but
when it starts getting into this personalization aspect, that’s where, exactly where machine
learning comes in, or AI. [music] -So, in a recent post on your engineering
blog last year, it was called “Celebrating Research Excellence in LinkedIn.” You gave a shout out to your researchers for
3 things. One of course is making your products better. But you also referred to contributing to the
open source community, and producing world-class research, peer-reviewed research. So, why is the work of your researchers important
to the scientific community at large, and vice versa, I would say? -[laughs] I grew up with an ideal
of science. And I, um, there was something that resonated
to me from a very, very long time. And I don’t know why. You know, you grow up and there’s some sentences
that stick with you. And one that really stuck with me from very
early on was, “Standing on the shoulder of the giants.” And the reason for that is I could see the
broad scientific community as just one community. And however great you were, you were standing
on somebody’s shoulders. And sooner or later, somebody would stand
on your shoulders. So, it was never something that it was done
for yourself. It was something that was done for the greater
community and leveraging all that knowledge and moving it forward. So, then of course, within that scope, contributing
back is very critical and very important on two things: one, well, if you have the next
generation, the next individuals; and also, a measure of quality. You may think that what you’ve done is great,
but it may be completely wrong. So, once you expose it, it’s when you get
that feedback that you can actually learn. Being right or wrong is not necessarily fundamentally
the most valuable thing, compared to just the experience of learning. -So, there’s been always a bit of tension
in the research community between proponents of applied research and basic research,
especially in industry. What’s your philosophy about the role of
basic research in industry, where the ROI may be years out instead of immediate, and
what role does basic research play at LinkedIn, a company whose product research is basically
embedded? -So, I’m a big proponent of
research, whether it’s basic or applied. And it’s all a matter of context. I believe that if you only limit yourself
to just applied research, you tend to sort of focus into just one area, and just dig
in more and more and more and more. And you’re preventing yourself from actually
seeing this paradigm shift that can occur. Basic research has this goodness, or this
overall perspective of, that there’s something that’s extremely challenging, and there’s
something that advances our understanding, and in itself is valuable. The application will follow or will not follow,
but it will spark ideas. Within the industry, it’s actually really
hard to do just basic research. It’s just a matter of whether you have the
ability to do so. In the end, the majority of companies are,
need to generate a certain amount of revenue, so you have to be able to actually balance
the two. There, Microsoft, I think it was very, very
good at balancing the basic as well as applied research. At LinkedIn when we started, we were a startup,
so you can’t really do basic research or fundamental research and go to, let’s say,
the CEO at that time, in your group of 5 or 6, and work on something and, “Don’t worry
about it, in 200 years, it will make a difference.” Because you don’t know whether the company
is going to be there two or three years down the road. So, you have to play within it, but still
have a very good perspective and overall view that those two worlds are sort of intertwined. You can see the work that Microsoft has done
through the years. Fundamentally, you can say, “Well, at which
point of time does, let’s say, um, quantum computing become basic to applied?” Well, at the beginning, it was certainly basic. And today, it’s almost applied. Although, the window is very far ahead. -Yeah. -It’s not happening tomorrow,
but it is happening. -So, let’s talk about research cycle
for a second, specifically at LinkedIn. What does it look like time-wise and outcome-wise
for you, and how is it similar and different from maybe some other things? Because you have a really unique approach
to research at LinkedIn. -Yeah. Our cycles are much shorter. Like, for example, the thing that we’ve
published, I made it sure that, very early on, that publications would be about things
that we’ve developed and shipped on the site and had affected our members’ experiences,
compared to things that we’ve developed in an offline environment on some subqueries
and whatnot, and it moved some metric. It had to have gone all the way in. So, in this case, our cycles are different
in a sense that the end value is around whether it actually went all the way down to the site. Now, tying into what you said, there’s also
another cycle. It’s, how long do you research before you
make a difference? And there, our window at LinkedIn is I think
most often, except for infrastructure, where it takes time to develop, it’s probably
maybe 1 or 2 years out. So, it’s certainly not Microsoft Research. -Right. So, what’s the relationship between the
researchers at, say, Microsoft Research now, and LinkedIn, now that you’re sort of part
of the same family? -Well, it’s in multiple dimensions. One is, in some places, we were, let’s say
for example, the neural networks, and using CNNs and GPUs, Microsoft was far ahead of
the curve, and then we’re benefiting from the learnings from it. For example, how to set up the topology of
our clusters, how do we set up the topology of our, let’s say, rankers for it. So, then we’re benefiting from knowledge
that somebody else has done, if you want. And there’s also another side as well. We have a different perspective at times,
and then it’s just sharing it, as I mentioned a couple of times here, that science advances,
or research advances, when you bridge those different types of connections with different
types of fields. So, we bring another perspective. We’ve been interacting very closely with
the New England team, who has been our, let’s say, partner and front door to the rest of
Microsoft Research. As you can expect, there’s lots of interest
for Microsoft Research. It’s good that it’s a little bit centralized,
so we can actually navigate around all those expectations and demands. [music] -So, you talk about the importance of
what you call “conversations” with your members. And there are multiple, multi-level conversations
going on at any given time. So, from a technical point of view, how do
you stay on top of those and make sense of those conversations in a meaningful way? -I think it’s a new perspective
that I’ve looked at it, and you tend to think that once you’ve found out this new
paradigm, the way that you’re going to communicate about it, it makes sense, and you wonder why
you didn’t see it before. And probably in 5 to 10 years, I would look
at myself and what a complete idiot I was. But that’s the nature of things. Trying to find a model that can help you reason
about a problem that you have. In this situation, it’s about conversations
in a sense that very early on, we try to do things with our members,
get our members to do things. And in this case, a conversation is really
an informal chat, if you want, between two individuals, and it’s between, let’s
say, LinkedIn and the member. And early on, what we’ve done is say,
well, we just blasted things. We just sent you emails like there’s no
tomorrow. We did very good optimization saying that,
well, the more email you get, the more likely you’re going to do something. Which is probably wrong. -Not me. -Uh. My point. But apparently, somebody had done a study,
and that seemed to be good. Of course, when you look at high-level statistics,
it seems to be fine. But the conversation was more like me screaming
at you. And then when you start shifting by saying,
well, there’s a lot of conversation that we can have, so which one should I communicate
to you, like, right now? And today, or in the recent past, we’ve
moved more into thinking about, well, this conversation doesn’t, don’t stop just
with the action. That there’s some sense of, what are you
doing at LinkedIn, and what do you want to achieve at LinkedIn? And those goals don’t stop tomorrow, don’t
stop on the click, don’t stop on the view, don’t stop on the share. They go for a longer period of time. And the problem then starts shifting, because
very often the techniques would be, I need to find, let’s say, a way to – use some
different techniques – to figure out what is your likelihood to act on something? It’s a probability. And in that case, whether you do logistic
regressions, whether you do different types of all the way up to uh, neural nets, you
come up with a number. And that number, and then you optimize around
it. And it’s usually just about that activity. Now, if you view it in the context of a conversation,
that activity doesn’t stop right there. It continues. We’re having a dialogue. It’s going to go on for a little while. You can’t just optimize for the next step. So how do you go in and out of that? -Right. -And I felt that it’s the right
time to think about it. Although, we had thought about it already
some years ago, simply because of the shift that we’re seeing nowadays with more and
more of just voice-driven interfaces appearing in a lot of places, which becomes another
way to communicate with an individual. Like, in the beginning, it was email, then
it was notification, then you have mobile pushes, or pull downs, then you have Windows
tiles on the desktop. And it becomes more and more natural, like
it’s voice-driven. And if it’s voice-driven, then that, it
is becoming a real dialogue. So, once you have that interface that you
can leverage to build up your application, how are you thinking about your optimizations? If you’re thinking them still in the logistic
regression work, it’s not going to work. -What does work? -Well, we’ve taken some early
steps a couple of years ago to look more around the quadratic constraints and quadratic programming. It worked for us, that first item, across
multiple different types of dialogues that we’re having with members, and they work
at, let’s say within a couple of steps. Overall, where that is all going is still
research. Just one or two steps ahead. -Right. -And we’re making good progress. And it’s going to actually be very, very, interesting
to see how those things are actually shifting. -Yeah. Well, it’s an exciting time with so much
research in machine learning, and people trying things, to see how it impacts both the technology
and the people that they’re working with. So, you said at one point that machine learning
actually helps transform, not just inform, but transform your interactions with your
members. Is that what you’re talking about here? -Fundamentally, at the core of
everything, you want an interaction to be very natural. Whether the design is superb, and it fits
exactly the way that you would anticipate it, it just feels natural. And that’s exactly where our field comes
into play. How do you make that thing natural? It’s not just a design perspective, but
it’s also, what is the content that you’re showing? How are you going there? So, it’s transforming that experience, instead
of just making it better. -Right. So, I’m a LinkedIn member myself. -Thank you. -Yeah. And I got to tell you, I have noticed, over
the last year, a difference in how I’m getting notified. And one thing I’ve noticed is that they
seem more personal to me? Like, I would get a notification about a person
that I know, or care about, as opposed to just feeling like you’re trying to pull
me into the app, right? -So, that’s exactly what we
worked over the last, I would say two years, to make it exactly that, compared to me pinging
you, “Hey, come back, come back, come back.” I don’t know why you would come back,
but here it is. Here’s like 2,000 reasons for you to come
back, pick one… To more, hey, here’s something of value
within the context of your interest of LinkedIn, or your value propositions we believe are
important. Here’s something that hey, on one side,
you ought to know, it would be a good thing, and on the other hand, it’s to share within
the network also. -Right. So, I have to say, that makes me happy that
the kinds of technologies you’re working on, especially machine-learning technologies
that might be helping to broaden your understanding of what I like, what I’m interested in,
is actually playing out in the real world for people like me. And I tend to be skeptical and cynical of,
you know, high-tech notifications. I don’t like those red notifications on
my phone, but anyway… -Sure. It’s kind of interesting when you pick up
the red thing. We associated red with danger. And it never occurred to me
why would it be red. Maybe it’s saying like, oh, you need to
do something like right now. But we have so many of them. And that’s why I think we’re ready for
that next iteration, that other change. I look at my mailbox, and it has like
numbers in the hundreds. From time to time I clear it out. I look at my phone, and it’s interesting
to see how some interfaces have decided to keep the number low, and others to keep the
number high. And to me, it seems that the ones that keep
the number high is those that are still screaming at you, and the ones that keep the number
low is to say like, we understand that you have a busy life. We understand that you have a lot of things
and a lot of applications on your phone. Let’s make sure that if I put a number out
there, it’s something that would be really good for you to know, compared to just,
“Come back, come back, come back!” -And that’s a fine line, right? Because you’re competing for people’s
attention in an increasingly crowded world. And where do you, you know, stop ALL CAPS,
and whisper to…? -But it seems to be working,
right? -Yeah, for me…. -No, but we can see it also. Of course, everybody is, to some extent, informed
by the metrics of the product performances. And we see that change. -Yeah. -Originally, people would be
confused about, “Hey, why LinkedIn?” And now it becomes more obvious. And then it becomes more natural, because
you actually steer the conversation to the right places. -Right. Which is interesting. This whole topic of conversation that we’re
on right now is talking about how technology is actually making me feel more personal towards
a particular product. Which is somewhat, sort of, backwards. But I like it. It’s working. Good job. -Data science, the terms themselves,
were not “glued together” to define what we’re doing less than 7, 8 years ago,
let’s say. Granted, there are some others that you can, that were sort of anticipating
that it was going to go there. But it’s similar with this sense of personalization
or this sense of, it makes sense within the scope of what I do, or who it is. And in the past, we used to call it an
augmentation of yourself. -Oh, yeah. -An agent that works, that extends
and works for you. And it fits naturally within what you do. And similar to data science back then, people
would look at you blank and say, “What are you talking about? What does it mean?” And today, you can start seeing it. You say, well, let’s bring all these AI,
let’s bring all this technology to make me quote-unquote better. Better as an understanding of what’s happening
around me. Better in the way that I can connect, interact
with it, augment the way that I can deal with the problems that I need to do at work, and
make me better at that. [music] -Most of the researchers that I’ve
talked to on this podcast are big “open source” fans, and I think you are too. You’ve published articles, given talks,
and you’re on the open source council, or you oversee the open source council at LinkedIn. Why is open source a good thing, especially
for researchers? -Well, I started by saying that
I love science because we all stand on the shoulder of giants. But open source is certainly a movement that
has just accelerated. And I have to say that in the beginning that,
everybody was wondering, is it going to stick for real? Is it going to stay forever? But it just fit so naturally, the way that
we develop things as developers. If you look at any company that has more than
two people, eventually there’s some level of abstraction that have built, and people
just leverage the code of each other. And open source, it’s just a further extension
to reaching out to the entire community. I was always a big proponent, even prior to
starting at LinkedIn, because we were just leveraging solutions that other individuals
had developed. For example, at LinkedIn, the search engine,
originally, the core of it was Lucene, an Apache open source project. Then we felt like, well, since we’re riding
on those giants, we should contribute back. And some other individuals will create businesses
around it, and that creates a whole ecosystem that becomes that much, much better. Then through it, we stumbled upon truths,
some of them being that you actually write a much better code when you actually think
about open sourcing it, compared to when it’s kept internally. And not only do you write better code, but
you have a better way to reason about it. You look at it, does it make sense, does it
not make sense? Does it complement what is out there, or does
it not complement what is out there? And a lot of goodness comes through that. -So, when we talked before, you mentioned
open source code as similar to a peer-reviewed research paper that you’re publishing in
some way. Can you explain a little bit about what you
think about that? -I usually take two allegories
about open source. One, is it feels like peer review, because
in a sense, you’re writing a piece of software, you’re documenting it, and you’re pushing
it out there. But there’s so many publications. And if your paper is not good, nobody is going
to actually build other things from it. You’re not going to be cited much. And open source is the same. If you’re not investing into your product
to make it great, nobody’s going to actually really leverage from it. And I used to take another one, which is I
guess through my life experience. Open sourcing is a little baby also. You cherish it. It grows. But then at some point of time, you’ve got
to let it have its own life. And it’s like your kid that grew up and
now becomes mature. And they’re going to do something that you
may not have envisioned, which is perfect. And open sourcing is the same. You sometimes have the tendency to keep that
kid at home. I’d say, “Well, no, let them discover
the rest of the world.” -Igor, I just open sourced my daughter
at the University of Washington. -[laughs] Congratulations. -I totally get what you mean. [laughs] Let’s talk about LinkedIn’s relationship
with the larger academic and research community through public/private research partnerships. You have a program called The Economic Graph. And it started I think as a challenge, but
now it’s an ongoing program. Give us an overview of what it is and how
it came about, and what research you focus on. -So, the Economic Graph challenge. The idea of letting external individuals from
LinkedIn look at our data and answer some extremely pertinent and relevant questions,
was there from the time that I joined LinkedIn, and probably even before that. So, how do we create an environment where
we preserve that privacy of the individual, and open it up to the community? That was what the Economic Graph challenge was. We basically reached out to the, to global
community, friends and family, and others, to say, “Well, if you had LinkedIn’s data,
what are the questions that you would want to answer?” And within that, we had more than 200 proposals
that came back. We selected, uh, 12. The ideas were just brilliant. They ranged from, how do you think about micro
industries and different environments from a geo perspective, like the fashion industry
in Milan, or, let’s say, the rise of electric cars in the valley, for example, to, how do
women define themselves, or write about themselves on LinkedIn compared to men, if you control
for multiple factors? But again, the resources are limited, so you
reach out to the community to go…. -So, the people that participate in this
with you, the academics or researchers, come from all different walks of the research life,
and they participate by presenting a proposal and getting access to some of the data from LinkedIn? Is that how that works? -Well, originally, we, as I mentioned, we just
broadcast and we got a lot of proposals back. And then we would evaluate on the idea, the
interestingness of the idea, but also the ability to execute on that idea. Meaning that you have the ability to, using
our data, answer the question. -Right. -Most ended up by being from
academia. -What’s your overall goal with Economic Graph? -To create economic opportunity
for every member of the global workforce. And that’s where I got to believe a bit
more into the vision and mission of LinkedIn, to create that economic opportunity. And not only “an” economic opportunity,
but the right one for you; the one that you aspire to. So, the way that we used the Economic Graph,
on one side, it still needs to be descriptive, because it provides the value
of the labor market. It provides an image, whether it’s instantaneous,
or whether it’s in the past. It provides a sense of, “What are people
doing in the labor market, and what are the movements within it?” It’s a living entity, and
we provide a picture of it. And there’s tremendous value to it, because
the timelines are very quick. You don’t have to wait 6 months, or 3 months,
to get a government statistic, but it provides a good picture, so then that
is extremely valuable. -Yeah. -The other is, to understand
and see that careers are also living, they migrate, they shift from one domain to another. You have multiple careers in your lifetime. And understanding what it takes and what it
doesn’t take, and helping all members to actually build up their careers. So, that’s where the Economic Graph is also
moving, to figure it out, how do I share that information with you? How do I help you nurture your career? And in the end, well, that’s to create economic
opportunity and get you along your life cycle. Everybody, I believe, starts a job thinking
that it’s going to be the job they’re going to do forever. I started LinkedIn thinking that, “Ehh,
2 to 3 years of my life,” and 10 years later, I’m still here. But… and actually, I’m still here because
of the vision and the belief of the ability that we can move it forward. But once you think in that route, you’re
seeing the fluidity of the labor market as being something that is important. And I wasn’t an economist before coming
to LinkedIn, whatsoever. But there’s some very fascinating mind-frame
around it, or frameworks to think about it. [music] -You’re a researcher. You’ve ended up parlaying a research interest
into a career. What advice would you give researchers who
were heading into the field of maybe data science, or even more generally, as they prepare
for their next steps after grad school? -The main one, be curious. Continue to be curious. There’s a sense of, if you went into getting
a PhD, you’re wanting to solve a problem, there’s something that was missing, you
went at it, and you dig in, and you attempted to move the answer a bit further along…
that’s great. It’s certainly one of the best experiences. On the other hand, make sure that once that
happens, don’t just limit the focus. I think that’s, at times, what researchers
are missing, that, yes, it’s valuable to dig in, very much, but be aware of the rest,
and make connections from that to maybe better what you’re doing, see different paradigms,
different perspectives. I was listening to Fabiola Gianotti just recently,
who is the head of the CERN. She studied first humanities before moving
into physics. And I view myself as a scientist also. And the things that I value more, that I wish
I had more going to school, is actually humanities. Not math or the rest, because I feel like,
no, I get these anyway. So, that’s a given. But the rest is missing. And if you think about it this way, today
at LinkedIn, I’m trying to understand, from a member’s perspective, what are the struggles
that they’re going through in order to make their career better? And you get that more by this side exposure,
by this interest. Make these connections. And these connections have to be outside of
just your domain. If I had one encouragement, it’s keep the
wonder. Just go for it. Just keep the wonder and be willing to learn
different topics. And challenge yourself a lot. -Igor Perisic, thank you so much. It’s been a delight to talk to you today. -You’re welcome very much.
Thank you. [music] To learn more about how researchers are
harnessing and using data to make the digital world – and your experience with it – better, visit Microsoft.com/research.
