How Graph Technology Is Changing Artificial Intelligence and Machine Learning

– Hello, everybody. Thank you for joining us today. My name’s Amy Hodler. I am the Program Manager for Analytics and AI at Neo4j. I’m here with my colleague, Jake Graham, who is the Product Manager for Artificial Intelligence and Analytics. And we’re gonna talk to you
for the next 40 minutes about how graphs enhance
artificial intelligence, in particular about context,
and we’re also going to give you a few examples
to walk away with. And so I like to start with a little bit of a, a level set around machine intelligence in general. And so there’s a lotta
different terms out there that you’ve probably heard,
AI, machine learning, deep learning, a lot of
different definitions. But at the really high level, what we’re talking
about are different ways to solve problems, in
particular with AI, a way to solve problems and apply learning that mimics the way humans make decisions. And what we’re really trying to do is develop probabilistic methods to use software to make predictions. Now usually that’s either to
predict what comes next based on features or the information
we might have at hand, or it’s about predicting
how to classify things. Am I looking at a cat or a dog? Should this community be
classified this way or that? Are they similar or dissimilar
to another community? Now if we think about
intelligence in general, and especially if we’re trying to mimic the way humans
make decisions, context is really important for that. It’s really important for understanding, it’s really important for predictions. Now I have an example of doing that wrong, actually from my past,
developing a logistics solution for a company that shipped yogurt. We made the assumption that
FOB for a logistics solution of course would mean freight on board. It turned out, for this
company, FOB meant fruit on the bottom. And that was very significant to them. It meant for them that
those yogurts needed to go in different containers and
different shipping methods. And so our solution failed them even though we had done a very clever job, we’ve done a lotta
deduplication for that FOB classification, and we
hadn’t taken in the context of the current situation, we had only learned from our past. And if you think about
how humans learn, we, in general, maybe not
always, are pretty good about trying to understand
from our past and then applying that in a variety of solutions,
variety of different ways. And so if we think about
how humans can do that and trying to mimic
that, here’s an example of how a child might learn to do something or rather not do something,
I don’t know if you all know what’s about to happen, I
actually did this as a child, I burnt my hand on a stove. Now what’s remarkable,
or maybe not remarkable, I didn’t need to do it a second time, I didn’t need to try another time. Now an AI solution or machine
learning solution might need to try a thousand different times. Try the spot in the back, try the spot in the front, try it a
different time of day. I didn’t need to do that. I didn’t need to try it
again at my neighbor’s stove. When we got a new stove, I
didn’t need to do it again because I had learned very quickly from this traumatic
experience that anything at counter height that you cook
on with metal that felt warm as you approached it was not something you put your hand on a second time. Now if we want to do that with a machine learning
type scenario, right now that can be very difficult if
you don’t provide thousands of different examples. So another example, taking this context idea
a little further is, you know, I’ve, I’ve
learned this as a child in this scenario with a stove. But another example, my father
used to race motorcycles, and the pipes and exhaust on
motorcycles can get very hot. Now you would, you would
hope that I wasn’t gonna, I’m not gonna show you another
traumatic event because, as a child, I had learned
from the first event that something that I approach
that’s metal, that’s warm, that I’m not going to touch again, that I’m not gonna try this again. And taking that context from very specific to more general to even more general is, that expansion is
considered contextual learning. I don’t need to have an
exact same situation, I’ve learned by context to apply that, the, what I can now recognize at that event. Now we would like to get AI
machine learning closer to that. Now, unfortunately, we’re
not quite there yet. There are some things we need to do. Now it’s probably pretty clear
from the previous example that missing contextual
information is a big deal. AI is very good at very
specific well-defined tasks but it struggles sometimes with ambiguity, with missing information. Machine learning often doesn’t use some of the most predictive
elements that we know of, which are relationships, especially if those relationships are
hard to reach or process. The other thing is that
processes that we have are mired in these older layers of complexity. A lotta this math has
been around for decades. We’ve done very good
recently with optimizing some of the hardware, and Jake will talk about process optimization,
but we haven’t used context to optimize that higher
level process overall. There’s also a lotta missing
context in knowledge systems. They’re missing that adjacent
information, getting the right data for the right decisions. If we think about the FOB example, it worked, the software
worked just splendidly, but it was the right solution
for the wrong problem. It wasn’t appropriate for that situation. So the other area that is becoming more
critical is explainability. So we often can’t explain, especially when we’re
talking about deep learning, why or how a decision was made. And so without context, that
can be really difficult. And that’s an area Jake’s going
to talk a little more about as well and how that can
really help us move forward. So graphs enhance artificial intelligence by providing context. So if you take nothing
away from this talk, take away the idea of adding in that rich
layer of information that’s all around us into your AI solution so that you can have better AI outcomes. And so there are four
different areas you heard about this morning that we’re going
to talk a little more about, and the first one is
connected feature extraction. How do we get connected features when we’re talking about
predictive features? How do we get those connections so we can have better accuracy? I will talk a little bit about graph-accelerated machine learning, so model optimization,
so you can have context for better efficiency. We’ll talk about knowledge graphs as well, which I think there’s a
lotta information out there. This one, I kind of get
excited about, so it’s context for decisions, and I always like to add appropriate decisions
and decision support. So there’s a difference between accuracy and appropriateness. It’s good to have both when you can. And AI explainability, so that’s bringing
context for credibility. So we’re gonna dive into connected
feature extraction first. There we go. And when you think about
this, it’s kind of fun. I mean, I always think about, you know, sometimes it’s who you know. So I think Hilary must have
stolen our example on this. So I’ll give you a different example. So if you have a lotta
relationships, second-degree, third-degree relationships with people that perhaps have, have
committed some kind of a crime, that’s a better indicator of you potentially having a criminal activity than some simple demographic. So, you know, very simply, I think with Hilary’s
description on the smoking, it’s also been proven with, with other, other aspects as well that you wouldn’t think would naturally correlate with relationships. Unfortunately, current machine
learning methods use vectors, matrices, tensors that are
pretty much built from tables. And I’m sure you’ve heard in
other lectures the difficulties of tables displaying and
holding up relationships with the multiple joins especially as you, you try to add in more. And these methods pretty much
try to abstract, simplify, and sometimes just
completely leave out entirely these predictive relationships
and network data. – I just wanted to point out one thing in the visual you’re seeing. What you’re looking at
are not tensors, right, you’re looking at tensor
spaces, or the spaces in which a tensor could occupy. You could have simple
one-dimensional tensors, basically vectors, moving
into multidimensional tensors. But I’ve got my phone up here. I wanted to make sure
I get it exactly right. The Wikipedia definition of
a tensor is a tensor is a geometric object that
describes linear relations between geometric vectors,
scalars, and other tensors. I think the important thing to focus on there is linear
relationships, right. Vectors, tensors, matrices are good at showing one degree of relationship. But that’s not the entirety of what a predictive
relationship is, right. It doesn’t take into account
more complex networks. It doesn’t take into account
multiple hops of relationships, but those can be very predictive. And where a graph fits in is
to help, one of the places at least, the lowest hanging
fruit, is to help deliver those predictive relationships
into your current tabular or tensor-based model. – Yeah, and you can do that
without actually having to alter your model and
your other algorithms that you’re using as
well, so adding those in without having changed what
your pipeline might look like. Graphs also are pretty good
at inferring relationships, so I’ll talk about that in a moment, where relationships don’t
exist or are very sparse. By using those second, third-degree hops, you can infer relationships
at a distance even if they don’t directly exist. So very good with that. And we have a couple different
methods that you can use to do connected feature extraction. You can use engineered
features, so those are looking at labels or inferred
relationships where you know what you’re looking for. I wanna know how many, at four hops, how many people
have a fraudulent account? And I want that, I want
the count and I need to abstract that and then
use that in another way. Graph algorithms are really
good at finding things where you, you kind of
know what the structure is but you don’t know exactly
what you’re looking at. So an example would be, I wanna
find the node in my network that can reach every
other node the fastest. So I know what I’m, kinda the
structure I’m looking for, but I am not sure exactly
what the node attributes are. Another example would be influence. Centrality algorithms are used
quite a bit where you want to know the influence but
you don’t know exactly what, you know, what that’s gonna look like or what the node’s gonna
look like once you find it. And then with graph embeddings,
that’s kind of on the other side of
being more unstructured on what you’re looking for. In that case, you’re taking
all of the relationships of the nodes and trying to
abstract them numerically and then use them in
something like, like a tensor. Now an example we’re gonna show is in a financial crime scenario,
fraud prevention scenario. And finance is a great area to
look at because there is just so much data and it is so highly
related, and especially in kind of fraud or criminal
scenarios when the attempt is actually to, to hide using multiple hops and multiple network relationships. So with this example, this
is a very toy example, it’s a very small example, we might have a couple
of accounts, of course you have hundreds, thousands,
probably, accounts. But somebody perhaps
is, is looking to make some kind of application,
a financial application, whatever that might be. And the immediate thing you’d
wanna do is well, let’s look at who they’re connected to. And you can really quickly look
with just these couple hops and start to see some interesting, some interesting
indicators, people using the same social security number,
same phone number, you know, it’s kinda looking a little suspicious. And so at four hops,
or excuse me, two hops, we can see four connections
already that might be marked as fraudulent or potentially fraudulent. And if you go out just another hop, we’ve added three more
four-hop connections. So pretty simple, you can
imagine you wanna do this with a lot more features, you wanna do this with
a lot more attributes. But you can see very quickly
how it would be easier to follow the hops and
find your connections than to do a complete brute force. Now this is kind of an
interesting scenario. Once you’ve found this
or you’ve identified this fraudulent potential network, you can then do further
algorithms on it to look at, well, who is the biggest influence here? What features are the
most influential in this? You could project out to a subgraph and then do some analysis on that. We have several customers
that use PageRank, actually, to look at their models
and try to figure out which feature is the most
influential in their model. So they don’t even care
what the feature is itself, they’re just, within that, find me the most influential feature,
and then those at the top 10 or whatever, we’ll look at, and we can disregard the other features. – And another thing that
you can do as you move from investigation into a
pipeline, right, the goal of data science in general is to understand the predictive
relationships in the data and then move it into
something autonomous. Really we wanna be able
to detect these things before they happen. A simple example of an engineered feature in this case is simply running a count. At every connection, how many known money
launderers are there, right. And in this case, what you might find is you’re not connected to
any known money launderers at one hop, but at two
hops, there’s multiple, and at three hops, you’re
seeing this pattern. And what you can do is engineer features to build into, be it a
rules-based engine, be it a random forest, be it
any other model to look at how many known fraudulent
accounts at one hop, at two hops, at three hops. And those tend to be
highly predictive features that are close to impossible to get out of any tabular data structure. – Great. And then what we can also
just quickly take a look at is the machine
learning graph algorithms. So you can see last year, we debuted our graph algorithms library. It had all of your standard graph algorithm
areas, and then this year we’ve also just about
doubled in size in algorithms. Pathfinding and search, all about finding optimal routes to things. Centrality, all about
looking at what’s the most important function or important node within your network. It could be a bridge,
it might be something with the most influence. And then community detection as well, often used in AI machine
learning as well to try to think, how do things clump together, and how alike or dissimilar they are. We’ve added quite a few of
similarity algorithms as well. We’ve got an ebook
available online with those, and then early next
year we’ll be coming out with a graph algorithms book
with O’Reilly as well, so. – The last thing I’ll add
on that is this is something that we’re consistently
looking to expand, to improve, whether that means you’re
looking at running an algorithm and you need help doing
that or you’re looking for a new algorithm that you believe should be
supported in the graph. We really need that customer
and user feedback to understand what your priorities are. So please reach out to
us to both understand how to better leverage these algorithms but also to help us prioritize
what algorithms come next as we continuously release new ones. So Amy talked a bit about
one type of context in AI, but there’s, there’s many
types of context, right. We were looking at what are the contextual
predictive relationships you have that might help you classify
or predict a next action? And I wanna talk about the contextual way in which you make decisions, in which people actually think, right. Hilary this morning talked about how neural networks are
an incorrect approximation of how the human brain thinks. But it’s not always really
understanding exactly how neuroscience works,
it really, it’s using them as a pointer to how can
we speed up and optimize our own workflows. So one example that I was
thinking about for this is, I don’t know if there’s
many comedy fans in the, in the audience, but when
I think of Steve Carell, I think of lamps. And if you’re a fan of
Anchorman, if you’re chuckling, it’s, you know Brick loves lamp. And if you haven’t seen Anchorman,
and you think I’m insane, I promise, it’s, you
should go check that out. But that’s a, that’s a relationship. But if someone asked me
something a bit more intensive, and they say, “Who does
Steve Carell remind you of?” I’m not going to run a
K-Nearest Neighbor classifier of Steve Carell against
lamps and against chairs and against abstract concepts, right, I’m gonna go through a
process of filtration to think of, okay, let me think
of what are the attributes that they share to narrow
down, narrow down, narrow down, and then I’ll run the more computationally intensive aspect there. So I’m gonna obviously
start with people, right, and then I’m gonna move into he’s a male. I might look at complexion. I might look at non-physical features. He’s an actor, he’s a comedian. And where I’m probably going to get is, if you’ve never noticed this,
Steve Carell looks a lot like Ben Stiller. But I didn’t just do this
in a brute force algorithm of let me look at every
image I’ve ever seen and every other image. And that’s actually fairly close to how modern neural networks do
computer vision algorithms. They’re going through, and
those, each layer is looking at, even though it’s not generally a human understandable concept, it’s looking at filtering down. The machine learning algorithms that a lot of you are putting in practice in your enterprises though
are not always as intelligent. And sometimes, the filtration process that you’re going through
is really inefficient. So we wanna talk about
ways in which the graph actually helps you speed some of these up and get into production faster. And speaking of getting into
production faster, Databricks and IDG had a study come out earlier this year about what
the biggest roadblocks to scaling AI in the enterprise are. Oh, my phone wants to talk to me, yeah.
– Your phone’s talking to you, yeah. – And, one of the things
really surprised me. Some of the things I’ve seen,
Gartner had another study that said 55% of enterprises
are piloting AI this year, only 5% have something in production. Databricks came out and said only one of three AI POCs are
making it into production. And when they looked at what the biggest complaints were, tied for first was the iterative
time, the time it takes to iteratively train a model. And data science is an
iterative process, right. The more times you can try and
tweak something, the better it’s going to get. That’s basically a rule. What really shocks me
about this is it was tied with not enough data. If you either are a data
scientist or work with one, you got, we all, we complain way too much about the data not being
good enough, right. It’s the consistent complaint. But one of the actual biggest
problems is, well, is just how much time it’s taking to
work through these iterations and to train well. A lot of that is because you’re not taking really efficient processes at heart. We’re doing what Hilary did when she held up the tiny chip, right. If, to Amy’s point, neural
networks have been around for 60 years, why have they
taken off in the last 10 years? Mostly because GPUs just
started to be leveraged for highly parallel applications. You’re waiting for more hardware. Before Neo4j, I worked at Intel. We loved everyone waiting
for more hardware. But there’s more to optimization than what’s the next CUDA release, right. And there’s definitely more to optimizing your machine learning than renting more CUDA clusters on AWS, right. Start thinking about, where
are the inefficiencies in my model? When I think about why Neo4j
exists, how we got started, I think the general problem
statement was we exist to solve the computational inefficiency of connected data,
right, of relationships. What’s the most common place where relationships manifest themselves? It’s in table joins. So if you get nothing else from me and every other talk here, I just want it to be table joins, table joins,
table joins, table joins. They’re terrible, stop doing them, right. They’re great if you’re
just exploring data, but if you’re building an application, you should be looking at a graph if table joins are your bottleneck. But they’re not the only bottleneck. So, obviously, if you’re seeing them as you’re bringing in data,
but what are some of the ways that that inefficiency manifests
itself in data science? One of them is what Emil talked about this morning, sparse matrices. If you look at some fairly common and well understood
machine learning practices, we talked about collaborative
filtering a decent amount, that’s really a giant
sparse matrix, right. I’m a power user of Amazon, maybe I bought a million products,
well they have a billion. So at a high level, the matrix that represents Jake is
999 million null values and a million positive values. And obviously there’s,
there’s ways that you can chop that down, but no matter what you do, those zeroes have cost. And compressing them, ’cause there are definitely sparse
matrix compression methods, those are just more tables,
they’re more indices, they’re more lookups, they’re
more complexity in your model. Graphs don’t have null values. We just have the data values. We just have the relationship values. So where you see yourself spending a lot of time compressing matrices,
that’s another place you can start thinking, I
might have a graph problem. When you start seeing
directionality, right, multiple directions, multiple hops, but not just one linear
direction, not just that tensor, or you have multiple paths. There’s a reason so many
logistics companies use us, right. Graphs are able, we have
directional relationships. When you start seeing
you’re getting bogged down with so many different
trees, so many different basically multidimensional matrices to track these directions, you
might have a graph problem. And then, finally, the one I wanna talk about a
little bit more, closer to the Brick loves lamp
example, is brute force. There are lots, so brute
force is just the concept of I am going to run this calculation
against all of the data that I have, or I’m gonna
do some manual subsetting, or I’m gonna pull in some
kind of a Bayesian inference to try and subset it. What those generally, those
processes generally have in common is they’re fairly manual. They generally require a data
scientist to put in practice, and they’re slowing down your iterations. Why, why is it so hard, when
in reality, what you can look at is clustering based
on relationships, right. If I’m looking at a KNN, if
I’m looking at classifying something via similarity,
I can probably just look at any one that has any shared
attribute, or how about anyone that has a shared
attribute of someone I know or someone they know, right? In a graph, the query to
say return me a subgraph of only people that share any attribute that I have is trivial, it’s milliseconds. And what you can do, if you take this visual PowerPoint example
of a recommendation graph, and you can imagine that
this graph might expand out into infinity, instead of
running whatever classifier I’m running against all of those nodes, I can close to immediately
project this subgraph, I could do it in an automated
fashion, I can do it in a, sometimes it’s better
to just do it in the, in the dumbest fashion possible. I can look for specific features,
I can use graph algorithms to tell me what features to run against. But what I can do is use it
in realtime consistently, and then from there I can
run my classifier to set up those weighted relationships of whatever I’m looking at, right,
whether it’s a classifier or whether it’s a regression. But only compare things
against things that are likely to have anything to do with them, right. The graph helps you
speed up that pipeline. So, I wanna talk, move on to… Oh, so yeah, just coming back through, just to think through what you see when you might have a graph problem in an inefficient data
science workload, table joins. I’m always tempted to just
get up and just, like, try to lead a table joins
chant, stop doing them. But also sparse matrices,
directional problems, and brute force is,
it’s really as inelegant as it sounds, right, you
should stop doing it. So moving on to how we’re
creating the context for decisions that Amy was talking about. One of the AI spaces, data
science spaces that’s moving into production the fastest
is decision support, right. It’s where you’re taking a human decision that requires that person to have the right contextual
relevant information and trying to automate it in some way. Now there’s lots of things
that you can automate out that are mundane tasks,
but that still leaves you with the hard tasks. And generally, hard tasks are solved today by domain knowledge, right. You just train more experts. You just build up more experience
so that someone can say, I’ve seen this problem
before, I remember it. There’s really no reason for
that to be the case anymore. And really what we’re talking
about is knowledge graphs. And I wanna at least try to explain, define knowledge
graphs, how I define them. It’s a term that gets thrown around a lot. I think of knowledge graphs
as a connected dynamic and understandable repository
of different data types. So pulling that apart quickly,
it needs to be connected, it is not a data lake, right,
it needs to be connected around the attributes
that are actually relevant to this knowledge. Knowledge is not, data
is not knowledge, right. Not all data is knowledge. You’re looking for relevant information that’s connected on contextual lines. It’s dynamic in that the
graph itself understands what connects these things, right. It’s not looking at,
you need to go through and manually program
every new knowledge piece that comes in. It’s able to make those
associations across the attributes that are important to you because you’ve already programmed that in. And it’s understandable,
sometimes you say it’s semantic. The knowledge tells you what it is. There is intelligent
metadata associated with it so that you can traverse this
graph to find what you need to answer this specific
problem even if you don’t know exactly how to ask for it. And on different data
types, the way I generally tend to define knowledge graphs is
it needs to be heterogeneous, it’s not just you have all of
the same data type in here. In general, there’s different
ways that graphs can help you with that, but I wouldn’t
call that a knowledge graph. We need to be looking at
multiple different types of data. And the last part of the
definition, a knowledge base is not a knowledge graph, right. Having everything in a data
lake is a fantastic thing. It’s a, you need it to get started. But if you can’t figure
out what something is and where it is and how it’s
connected to other things, that, that is not a graph. So instead of talking about
one example, since I think we’ve, we really are the de
facto leader in this space, most of the most common knowledge
graphs people talk about other than the Google knowledge
graph are built in Neo. Instead of giving one
example, I wanna talk about the three major
types of knowledge graphs that we see in the market. And these aren’t Neo4j
canon terms, these are, these are just my terms, so you might not, you might see them slightly different. But I tend to think of these
three types of knowledge graphs of context rich search,
external insight sensing, and enterprise NLP, I
like to think of them as basically a hierarchy. The first one we’re gonna
talk about is the hierarchy of your data types, what
are the types of knowledge you’re looking for, what
are the types of documents that we’re trying to traverse. The next level, I tend to think
of as what are the entities, what are the things,
these specific objects that we know are important,
and what’s all the information that’s being generated around them, and how can we generate insights from that in a non-prescriptive way? And then the last one is getting into the actual granular
level of your language. It’s, the biggest problem in scaling NLP in most enterprises is
that you’re not talking about natural language, right. The natural language part,
there’s a lot, there’s a long way to go, but that’s a, that’s
a fairly well-solved problem. Google is not helping you to
define your own ontology, right, your technical terms, your product names, your product families,
your industry acronyms, your synonyms, your common misspellings of those technical
things, your part numbers. These are all technical
terms that are custom to you, they’re enterprise language. Where we can step in is to
help you map those together and build that ontology. So just giving a little more info on each, context rich search
doesn’t actually just have to be search, but if you, if, if you are looking at just doing TF-IDF, term frequency-inverse document frequency, if you’re just looking,
doing keyword document search across large corpuses of
heterogeneous knowledge, you’ve probably found it doesn’t work that well. What a knowledge graph can allow you to do is understand the context
and how things are related and traverse that much faster. But there’s non-search
applications as well. We’ve seen them used in customer support. If you’re in an enterprise where you get tens or
hundreds of thousands or more complex technical
support issues a year, being able to show a technical support person, what’s the most similar
problem we’ve ever seen, how is it solved, and what are the associated
knowledge documents and technical documents to this problem can accelerate
resolution enormously. Where I tend to talk
to people about looking for context rich search,
identify a process in your enterprise that
you know is inefficient. Measure that inefficiency
and build a knowledge graph and make sure that you’re actually targeting moving
that KPI up, right. Building KPIs in around
AI workloads is critical. So look for that inefficient
knowledge worker process and start asking the
question, what do you need to answer this question
better, and build that into your graph. External insight sensing is,
generally where we see this is when you’re combining
internally known entities and the knowledge around them
with external information. So we see it in things like supply chain risk,
right, the ability to look at all of your suppliers,
all of the places they have manufacturing, all
of your, your supply lines and associate those with
disruption risk, being able to tell that your competitor is being
tweeted about buying one of your suppliers and flagging
that information, being able to look at a natural disaster that’s hitting a specific
place that you have a, you have a supply chain
and giving someone options of what are similar
suppliers that you have. Or looking at market
opportunities, looking at financial information,
looking at investment management, or excuse me, investment banking,
being able to understand, what are the types of companies who, who’s talking about buying
who, so that you know who you should be talking to,
who you should be targeting in your CRM to go sell services. But in general, you’re looking at, there’s an enormous amount
of information in the market, we need to be able to sense it, determine what’s contextually relevant, and present it to the right person. I’m gonna end finally at enterprise NLP. I already jumped ahead of
myself and talked about how we can help build them. Hilary
Mason this morning talked about word embeddings. There are lots of fantastic word embedding libraries, and all the CSPs are getting better, but they’re probably not doing
it for your business. They are using graphs to
build them, so should you. And then the last one I wanna
talk about is a bit more on the research side. It’s another thing that was talked about this morning, explainability. And it was kinda funny,
at the graph talk she didn’t talk about
adding explainability to neural networks. There’s actually a lot
of research going on around graphs, and specifically
graph databases, to start to bring that explainability
to complex neural networks, and I, I do think it’s
important to recognize, when we’re talking about explainability, we are talking about deep learning. In general, even complex decision trees, if you get into a, a large random forest, their outputs are still individual layers, you can still see feature
and coefficient, right. It might not be business analyst readable, but you can get to an explanation. Complex neural networks
are not explainable to the data scientist who wrote them. So I wanna give an example of that. There was an interesting
example last year in an academic setting of a
professor who asked his class to build a classifier for
photos of wolves versus dogs. And that’s a very hard problem, right. Those are actually pretty
similar, even though it’s a snarling wolf and
an adorable Vizsla puppy, it’s a pretty hard thing
to be able to identify. One student actually did
a really good job though. And he was getting really
good values on labeling things as a wolf versus a dog. And no one else had been able to do this. And the professor was trying to understand how did this happen. So one of the methods that people use to bring explainability is
called perturbing the model. You’re gonna take in a known dataset, you’re gonna feed it
in, and you’re gonna see how your classifier reacts to that dataset to try and abstract, okay, I
changed this specific feature, I can see how, where this is going. It’s a pretty cumbersome process. But he was able to perturb
it and find pretty quickly what was going on. He fed in, not this exact image
of an adorable puppy helping his owner clear the driveway,
but something similar, and what they found was,
every time that came out, it was coming out as a wolf. Give me my stamp, there it is. And then, in whatever, you
know, dystopian bureaucracy that you can imagine that
the algorithm says to do this to this wolf would then
happen to this adorable puppy. And what he realized was really similar to what Hilary was talking
about this morning. So French fries near the water, snow and a dog equals a wolf. That’s obviously not right, but
in order to understand that, you had to feed through
known datasets, figure out where the problems were,
manually go through and look at these, go
back and piece it apart. And this is a professor
of computer vision, right. If
you start thinking about the things we all talk about all
the time, right, healthcare, credit risk scoring, approvals,
crime detection, sentencing, like the list goes on
and on for the things that people wanna apply deep learning to, if you can’t tell that
this puppy is not a wolf, you probably shouldn’t be
recommending sentencing yet. So, so what are the different types? And it’s important to break
these problems down into, you know, the type of problem
we’re talking about, right. It’s not just
explainability for all AI, ’cause not all AI
is inexplicable, right, it’s deep learning. Within deep learning, what are
the types of explainability we’re talking about? What is explainable data? What data did we use, right? Can we even understand
how this was trained? That can actually be a
lot harder than you think if you start talking about some of the major cloud service
providers telling you what data they, what data
Facebook has used exactly to get to all of its algorithms. It’s actually not that easy of a problem. Explainable predictions,
that’s where we’re talking about things like perturbing the model, where you’re looking at, the algorithm itself is inexplainable, but I
wanna understand specifically why it’s giving me this value, let me try to piece this apart. And then the panacea,
explainable algorithms, so algorithms where you can actually get those coefficients out,
and the important part of this one is those already
exist in deep learning, they’re just not performant. And we already talked about
how hard it is to get more GPUs and how great NVIDIA’s stock
price is doing right now. Performance matters, so
limiting performance is probably gonna be a non-starter as well. So how are graphs starting to
be looked at in this space? Explainable data is an easy one for us. We have 20 of the top 25 financial institutions using us in some way, and data lineage is a pretty common one, right,
the ability to understand how data was changed, where
it was used, why it was used, who used it, is something
that we can do really well. If you’re pulling that data
in from a knowledge graph, it can also tell you what
that data is, right, it’s, it’s explainable, it’s got
semantic understanding. So that’s, that’s low hanging fruit. Explainable predictions,
there’s something cool that’s come out very recently. A partner of ours,
Fujitsu’s AI labs (this photo is actually from some of their public information,
if you search for Deep Tensor and explainable AI), they’re
using knowledge graphs to help get to explanations
for specific predictions. So this particular case is
taking genetic mutations and looking at them to
try to associate them to specific diseases in ways
that haven’t been done before. And when you’re combining
hundreds of thousands of mutations with an equal number of
diseases and multiple layers of classifiers, that’s
very difficult to do, except what they did is
they built a knowledge graph that included each one of
those mutations, included those diseases, and it also included all of the academic information that describes those diseases,
that talks about them. And they set up that semantic
connection around, look for mutations, look for
diseases, look for the things we’re looking at. And what happens is you
can still see the neurons, in this case the nodes, that
are used in a neural network. What you can’t tell is how they were used. But if you can see which
nodes were actually used in a prediction, you can then
traverse your knowledge graph to say, oh, well, this makes sense. I, you know, see that this, this research paper is associated
to this research paper, and actually I can see that this genetic mutation
is being talked about in something similar, right. So it’s still not very
elegant, but it is more elegant than just trying to feed in
more data to perturb your model because you, you don’t have
the photos of more dogs to feed in that are
labeled for this, right. You’re looking at unknown patterns. If you don’t have known patterns
to perturb in that model, it can be close to impossible to do that. So the knowledge graph is
allowing researchers to say these mutations are
associated with these diseases, and here’s all the documentation
that explains why. And then, finally, we go
to explainable algorithms. This is a bit further out. We are starting to see
research, and Fujitsu’s actually working on this as
well, on actually constructing your tensor in the graph, making sure that each linear relationship is
actually a weighted relationship between nodes, which
is showing signs of being able to actually get at the explanations and the coefficients at each layer. But I don’t wanna
overpromise that too much, ’cause I actually haven’t
seen it working yet. But that
is some of the cool, fun, and exciting stuff that
we’re working on right now. So with that, I am gonna pass it back to Amy to close us out.
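The "perturbing the model" idea from the wolf and dog story can be sketched in a few lines of Python. This is only a toy under loud assumptions: `toy_wolf_score` is a made-up stand-in for a black-box classifier, and the feature names are hypothetical, not anything from the talk or from a real model. It shows the mechanic Jake describes: change one input at a time, watch how the output moves, and infer which input the model is really leaning on.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def toy_wolf_score(x: Features) -> float:
    """Hypothetical black box: it has quietly learned that snow means wolf."""
    return 0.9 * x["snow_in_background"] + 0.1 * x["pointed_ears"]


def perturbation_importance(
    model: Callable[[Features], float], x: Features, baseline: float = 0.0
) -> Dict[str, float]:
    """Replace one feature at a time with a baseline value and record
    how far the model's score moves from the unperturbed score."""
    original = model(x)
    shifts = {}
    for name in x:
        perturbed = dict(x)          # copy so the input is left untouched
        perturbed[name] = baseline   # knock out a single feature
        shifts[name] = abs(original - model(perturbed))
    return shifts


# A puppy photographed in the snow (all values are illustrative).
puppy_in_snow = {"snow_in_background": 1.0, "pointed_ears": 0.0, "floppy_ears": 1.0}

shifts = perturbation_importance(toy_wolf_score, puppy_in_snow)
most_influential = max(shifts, key=shifts.get)
print(most_influential, shifts)
```

On this toy input the background feature is the only one whose removal moves the score, which is the "snow and a dog equals a wolf" failure in miniature. With a real image model you would need labeled inputs to perturb, which is the limitation the knowledge graph approach is trying to get around.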
– All right. Thank you. So I think by now you’ve hopefully understood the
theme is context for AI, and in particular, just to remind you that aggregation of information
is not intelligence, nor is learning without context. That’s not intelligence and comprehension. So think about, you know,
think about what you’re doing. And if you’re still asking yourself, “Will graphs help me? Will graphs help my AI
or my machine learning?” There are pretty much,
you know, four areas, I think we covered them,
but just quickly, to summarize: look at your model. Do you have predictive relationships
or network components that are predictive that you can’t reach? You’ve gotten to two hops but
you’re trying to get to three or four hops out and look for those indirect
inferences and connections. Are you, you know, dealing
with a lot of joins? I think we talked a lot about that, I think we’ve flogged that.
– [Jake] Table joins, table joins, table joins.
– Yes, no joins, joins are bad. But also if you have a
lotta sparse matrices or, or other highly indirect
components in your model. Also, directional, we talked
a little bit about that. That is such a classic graph
application is understanding paths and understanding direction and how things spread through a network. And, of course, the knowledge sources, if you’ve got a lot of
heterogeneous knowledge sources, or sources
all over that are hard to integrate, that would be
another indication as well. So, with that,
we wanna help
and we wanna work together on new solutions, but what we really need from you guys is to come tell
us what algorithms aren’t in the library that you
want, and come tell us how we should integrate with
the workflows that you have. So thank you everybody. Come find us afterwards for questions. (audience applauds)
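The "two hops versus three or four hops" question from the summary can be made concrete with a small breadth-first search. This is a minimal sketch in plain Python, not Neo4j's API: the graph, the node names, and the `within_hops` helper are all hypothetical and exist only to show what an indirect connection looks like.

```python
from collections import deque
from typing import Dict, List


def within_hops(graph: Dict[str, List[str]], start: str, max_hops: int) -> Dict[str, int]:
    """Breadth-first search returning {node: hop distance} for every node
    reachable from start in at most max_hops steps (start itself excluded)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


# Hypothetical toy graph: a customer linked to a flagged customer only indirectly.
graph = {
    "customer": ["account"],
    "account": ["device"],
    "device": ["other_account"],
    "other_account": ["flagged_customer"],
}

two = within_hops(graph, "customer", 2)
four = within_hops(graph, "customer", 4)

# Connections that only show up once you look past two hops.
only_beyond_two = sorted(set(four) - set(two))
print(only_beyond_two)
```

In a relational model each extra hop is typically another self-join, which is the pain the "no joins" joke is about. In a graph database the same question is usually written as a bounded variable-length path pattern instead of hand-rolled search code.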
