Truist AI Symposium Workshop: Data meets deep learning for in silico biologics with Absci
Feb 10, 2022
Have you ever wondered how AI can help discover better drugs, but gotten lost in the complexity? The Absci team (founder & CEO Sean McClain, CTO Matthew Weinstock, and Lead AI Scientist Joshua Meier) recently helped demystify this question and more in a fun conversation with Robyn Karnauskas of Truist. Watch this video to learn more about how Absci is combining synthetic biology and AI to develop next-generation biologics.
Transcript:
Robyn Karnauskas:
All right. Good afternoon, and good morning, everyone. Thanks for joining us. I know it’s been a really busy morning, but I have a disclaimer to read first. Before we begin, I need to read the following disclaimer. This call was arranged by Truist Securities Research for use by institutional investors and issuer clients as defined by FINRA. If you’re not an institutional investor or issuer, please disconnect at this time. For required disclosures, please see our website at truistsecurities.com or Equity Research Library. So, I’m super excited today to have on this workshop one of our first in a series of AI workshops, Sean McClain, the Founder and CEO and Director of Absci, as well as a lot of his colleagues. So, I’m going to turn it over to Sean and we’ll begin the discussion. We hope it will be really helpful for a lot of investors out there with various different degrees of knowledge of AI. Sean, over to you.
Sean McClain:
Awesome. Thanks, Robyn. And welcome to the Truist AI Symposium Workshop. I’m Sean McClain, your host today, and also Founder and CEO of AbSci. In today’s program, we’re going to be demystifying AI and machine learning in drug discovery so that you can invest with greater confidence and obviously impress your friends at a dinner party. To achieve this goal, and obviously to make me look good, I’m joined by three amazing industry luminaries. First off, we have AbSci’s Chief Technology Officer, Matthew Weinstock.
Robyn Karnauskas:
Hi, Matthew.
Matthew Weinstock:
Hi. How’s it going?
Sean McClain:
Second off-
Robyn Karnauskas:
It’s going.
Sean McClain:
… is AbSci’s Lead AI Scientist, Joshua.
Joshua Meier:
Hi, everyone.
Robyn Karnauskas:
Hey, Josh.
Sean McClain:
Hi. And our one and only, Truist Securities’ Robyn Karnauskas. All pushing the forefront of AI discovery. As you all know, AbSci is changing the game in drug discovery for protein-based therapies, and it’s a really fun, cool place to work. So, we thought that we’d bring some of that approach to today’s program. While we love getting tons of questions from our analysts, we’re going to flip the tables today and ask Robyn a few questions and then we’ll all grill each other. It’ll be a great time today.
Robyn Karnauskas:
Great.
Sean McClain:
Robyn-
Robyn Karnauskas:
Go for it.
Sean McClain:
The question is, are you ready for this?
Robyn Karnauskas:
I’m ready for this. I’m ready. I’m ready, I will answer with pure honesty. Go for it. Go for it, Sean.
Sean McClain:
I love it. All right. So, here is the first question. Robyn, is AI really here to stay? And what do investors think of this space in really combining AI with biology?
Robyn Karnauskas:
Well, despite everyone, my personal opinion is AI is the wave of the future, everyone’s going to be using it. We heard Amgen this morning mentioning it on their call. And people are going to have to learn to understand what AI is. So I think it’s here to stay, it’s going to be incorporated across the board in many different modalities. What do investors think? I mean, it’s a range. There’s some people who are super skeptical, I hear, “I don’t think a machine can develop a drug,” and then there’s people who are super excited, they understand the technology and they understand that this is sort of the way of the future and they’ll invest. So it’s the large gamut. One of the things that we’re trying to do is sort of get people to understand the terms so that they’re comfortable with learning about AI and they can start doing their homework and evaluating whether or not it’s going to work.
Sean McClain:
Yeah. No, it’s really interesting you talk about Amgen talking about AI and using that in the call this morning, I mean, are you hearing of other large pharma companies talk about integrating and using AI? Is there kind of a spectrum of large pharma using it and those that aren’t?
Robyn Karnauskas:
Well, obviously, you have partnerships like with Merck, so there’s a lot of companies that are partnered with big pharma and big pharma is starting to talk about what they’re doing, but they’re not going into depth. Obviously, Pfizer will be attending our March 1st conference. And shameless plug by the way, shameless plug March 1st AI conference, Truist Securities. But basically, Pfizer is going to talk about how they used AI to sort of speed up drug development of their COVID vaccine. So I think investors don’t really realize how this is being used now, unless they’re invested in one of the companies that’s actually supporting or partnered with pharma, I don’t think they fully understand it yet.
Sean McClain:
Well, the really interesting part too is like they’re using it in various different ways, whether that’s for small molecules, large molecules or clinical trial design. We believe that AI is here to stay, and we’re super looking forward to diving into how we’re using AI in drug discovery today.
Robyn Karnauskas:
That is a good segue. So, I thought first we could get over some terminology and sort of clarify things for investors. So this is a question, I guess, for Joshua, our AI expert. Could you please explain to everyone, why is everyone so excited about AI, machine learning and bioinformatics? Why are you excited about it? You’ve made it your life purpose, I guess. And help us understand the different terminology. Is AI, machine learning, bioinformatics the same? Help us understand it since you’re the AI god.
Joshua Meier:
Yeah. Great question Robyn. So, AI, machine learning, bioinformatics, what’s the difference? It’s funny you ask that. I used to work at Facebook doing protein modeling at Facebook, and everyone always asked the question, why is Facebook doing biology? And it turns out that on surface level, AI and bioinformatics look pretty different, but when you get to the fundamentals, a lot of the math is actually the same. AI is all about learning patterns from data. And bioinformatics conventionally, has been about distilling insights from biological data. So on that fundamental level, it’s all about understanding what’s in your data and using that to solve some problem that you have. What’s really interesting too when you start to apply all these new advancements in artificial intelligence to biology is that AI has this property where it’s really hard to get working on the easy stuff.
Joshua Meier:
Take self-driving cars, for example, it’s something that was promised to us years ago and we still don’t have. It’s still very early days for the field. But on the other hand, AI’s actually a lot easier to get working on the hard stuff. You might think, “Biology is so complex. How is AI going to help us there?” Well, it turns out the baseline is just so much lower. You and I can’t look at a protein sequence and understand anything about it, but it turns out that the models that we’re building here, when we show them protein sequences, they’re actually able to distill really interesting insights, they’re very powerful for drug discovery. What’s really fascinating to me is when you dive into the mathematics, the algorithms that have conventionally worked best in AI and in bioinformatics are very similar.
Joshua Meier:
It’s almost like you’re getting to some kind of truth in the math of how do you actually learn. And that’s something that’s really been interesting to me throughout my life, this process of learning and how do you actually get information and do something with that. And there’s no better way to use my technical skills to do good for the world. Where else can a programmer literally be saving lives? It’s really an exciting space to be in. And maybe to give a little bit more color, many people are probably familiar with conventional AI, what we like to call supervised learning in the AI space. This is where you show the AI like a picture of a cat and you say, this is a cat, here’s a picture of a dog.
Joshua Meier:
This is a dog. What might be more difficult to grasp, and this is really what’s happening now, is something we call unsupervised learning. This is where we just give the model a ton of data. And we say, “Here is information, do something with it.” This is something where we’ve made considerable strides in the field in the past five years or so. And it’s really fueled the AI revolution. And this is where massive neural networks, large amounts of compute and big data really come together. So Sean and Matthew, I’ll be honest that when you reached out to me, I almost didn’t reply back because I was wondering, what is this synthetic biology company doing with an AI person? What made you think of incorporating AI into the company and how does the technology fit in with what you’re doing now?
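[Editor's sketch: the "here is information, do something with it" idea Joshua describes can be illustrated with a deliberately tiny toy. It assumes a handful of made-up, equal-length sequences and a simple per-position frequency count standing in for the model. It is not a neural network and not the method used at AbSci or Facebook; all names are hypothetical.]

```python
from collections import Counter

# Made-up, unlabeled toy sequences (assumption: already aligned, equal length).
unlabeled = ["ACDE", "ACDF", "ACGE", "TCDE"]

def fit_position_counts(seqs):
    """Learn, with no labels at all, which letters appear at each position."""
    length = len(seqs[0])
    return [Counter(s[i] for s in seqs) for i in range(length)]

def fill_mask(masked, counts, mask="?"):
    """Fill each masked position with the letter seen most often there."""
    return "".join(
        counts[i].most_common(1)[0][0] if ch == mask else ch
        for i, ch in enumerate(masked)
    )

counts = fit_position_counts(unlabeled)
filled = fill_mask("AC?E", counts)
print(filled)
```

[Nobody told the counts what any letter means; the pattern comes only from the data, which is the contrast with the labeled cat and dog pictures.]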
Sean McClain:
Yeah. No, that’s a great question, Joshua. AbSci has been around for the last 10 years, and we’ve built this really exciting synthetic biology platform that’s really our data generating platform, which consists of our SoluPro E. coli strain that’s capable of producing large complex mammalian proteins. And we pair that with these breakthrough screening assays. And when you combine these two technologies together, you’re able to screen protein-based drug candidates in the billions in a single experiment, looking at the protein functionality and the manufacturability, that’s the drug functionality and manufacturability all in a single experiment within the billions. And this is really compared to conventional screening methods and maybe screening thousands of protein-based drug candidates in weeks or months. And so this gives us the throughput of data needed as well as the quality to really leverage deep learning.
Sean McClain:
If you look at AlphaFold, huge breakthrough in AI within biotech. I mean, I would argue that it was one of the biggest breakthroughs in biotech and in AI over this last year. And they were able to predict protein structure from the amino acid sequence. But what they talked about in the paper was not being able to predict protein-protein interactions due to the lack of data. And that’s what we have here, that protein-protein interaction data, and that’s what’s critical for being able to design brand new drug candidates. Again, we have the data, the throughput and the quality actually needed to leverage the AI. I like to say think of this like the Google index search of drug discovery, Google’s never going to be displaced because they have the pipeline of data. That’s the exact same thing here at AbSci. And Matthew, I mean, I’d love to hear if you have anything else to opine on this.
Matthew Weinstock:
Yeah. Maybe just to add a little bit of color to what you said, when you’re going about discovering and developing drugs, there are lots of permutations that you want to explore, so lots of experiments that you want to run. And in order to run all the experiments that you want to do, you need to have assays that are extremely high throughput. Traditionally at AbSci, we were using those assays that we developed here to fish the needles out of the haystack. So, here’s a billion different drug designs, let’s find the few that actually have the properties that we’re interested in. And that works because we want to drive towards the ones that actually work. But we started realizing we have these assays that can generate huge amounts of data on not only the needles in the haystack, but the mediocre things, the things that are bad. And we can leverage all of that data in a learning process to actually help us predict things that we didn’t even test in the laboratory. And that’s where the AI fits in is it’s kind of going beyond what we’ve actually tested in the wet lab and doing those experiments using computational methods.
Robyn Karnauskas:
Matthew, can I just add a question, because I get this a lot from investors. People say, “You can’t just use a machine to design a drug, you need to combine machines with some experiment or physics base or wet lab.” What is your point of view on the requirement for what you need to make a machine help you develop drugs? What are the things you need for that?
Matthew Weinstock:
Yeah. I think this is actually a really interesting shift in the way that people think about doing science as you bring AI into the picture, because obviously, you need the data. That’s going to be the theme that you’ll hear us harping on again and again as we have this conversation, I’m imagining. And really what it boils down to is that the machines are able to design these drugs because they can look at the data and they can see patterns that a human researcher wouldn’t be able to see, just because we don’t have the brain space to kind of see patterns of that level. And this is actually … it’s a blow to the egos, I think, of a lot of scientists. We’ve talked to scientists, they haven’t even looked at our data and they’ll say, “That doesn’t work.” And you could imagine from their perspective, they’ve lived in poverty getting their PhDs.
Matthew Weinstock:
They’ve gone on to live in poverty in a post-doc or two, and then scraped by as they try to get tenure as a professor and then they finally try to build the name for themselves as the world expert in whatever disease or approach. And then like this upstart company comes along and says, “We don’t have the degree of expertise that you do, but we have these models that can pull out these patterns that can design better drugs.” That’s a blow to one’s ego, for sure. But it’s a shift in a paradigm.
Matthew Weinstock:
We’re not teaching these models everything that humans have ever known about biology and then asking a machine to run a calculation and design a drug. We’re showing them data in a way that’s unbiased, where we’re not teaching the machine what an amino acid is or what it means to have a positive charge or a negative charge. We’re saying, “Here’s a bunch of data now pull out the patterns that we aren’t able to see and teach us something.” That’s a different way to think about it, but it’s extremely powerful, and it’s the way that drug design’s going to be done in the future.
Sean McClain:
And it’s not just powerful. It’s absolutely beautiful to be able to merge biological data with AI and have AI teach us about biology and to increase our understanding in a way that we as humans have never been able to discover things about biology before. And to me, that’s really a beautiful thing when you see technology and biology being merged together to increase our own understanding of biology. And we’re already seeing that. I mean, our AI is discovering things that we as humans have never predicted before. And I think that is the future and that’s what we are seeing. And again, AI and biology, the merging of those two is really here to stay.
Matthew Weinstock:
Yeah. And I would add to this, it may be an interesting parallel. If you look at some of the other areas where people might be familiar with AI, it’s in the realm of games, where you can teach essentially an AI agent to play a game like chess or Go. And you talk to the world experts and they would say, “There’s no way a machine would ever beat me at this.” And then they start playing these machines and the machines make moves that are not intuitive. And they’re like, “Oh, look, it’s already made a mistake.” And then they realize as the game goes on, “The machine is actually thinking many more steps ahead than even me as a savant can do, and it’s already got me beat.” And that’s kind of the same approach with teaching these machines to develop drugs, the idea is that they’re going to go beyond what humans can do and develop the better drugs.
Sean McClain:
Yeah. And not only can they look at one parameter, but the AI can look at like multiple parameters at one time. So, it’s not just looking at the affinity, but also is this drug candidate manufacturable, does it have the developability aspects that we want, does it have low immunogenicity? And so you can basically take all of these problems or different parameters when designing a drug molecule and feed them all into one model. So, you’re able to get the exact outcome that you want with all the drug characteristics. And I think that’s also a really exciting aspect when you layer on AI for protein-based drug discovery.
Robyn Karnauskas:
Got it. Maybe we can shift the conversation just a little bit then and talk a little bit more about … Maybe I can grill you now maybe a little bit. I want to make sure that investors get out of this a lot of clarity around some of the key questions that … I’ll just give you the questions that they’ve been asking me and sort of you can hopefully answer. Let’s go basic, first of all. I know what a small molecule and a large molecule is. Maybe you guys can explain it to investors, because as you know, people who are interested in your story are not just biotech people, they’re also tech people, they’re also tools people, so they have various knowledge bases.
Matthew Weinstock:
You want to take that one, Sean, or you want me to?
Sean McClain:
Go for it, Matthew.
Matthew Weinstock:
Yeah. Yeah.
Sean McClain:
I would say it’d be interesting for you to explain the biology side and then Joshua to dive into the complexities on an AI front between the two.
Matthew Weinstock:
Yeah.
Robyn Karnauskas:
That’d be great.
Matthew Weinstock:
So, a small molecule versus large molecule, small molecule, these are things that many people are familiar with in terms of like a pill in a bottle. So, aspirin is an example of a small molecule. It’s literally a small molecule. It’s simple enough that a chemist can synthesize it in a laboratory using chemistry sets, so think beakers and glassware and stuff like that. That’s small molecule.
Matthew Weinstock:
Large molecule is a class like in our case, we’re talking about protein-based biologics. So, these are based on proteins, which are a type of biomolecule that is much more complex than a small molecule. They’re so complex, in fact, that you don’t actually have a chemist synthesizing them. Mother nature is the chemist. You rely on biological systems like genetically engineered cells to actually produce these types of molecules. Examples would be insulin, which is a very common biologic that’s used by a large number of people, or maybe some of the listeners have heard of things like the molecule trastuzumab, which is used in treating HER2-positive breast cancer. It’s an antibody, that’s a class of biologic. And those are large molecules. That’s kind of the difference in what we’re talking about.
Joshua Meier:
Yeah…
Robyn Karnauskas:
Go ahead, Josh.
Joshua Meier:
No, go ahead, Robyn.
Robyn Karnauskas:
I was going to say then taking it a step further, I guess this is a question for both Matthew and Joshua, there’s a lot of players in the small molecule AI space. So, I guess the question I have there is what are the challenges with developing small molecules using AI this way? And start with that. And why are there so many companies doing it that way versus say biologics?
Matthew Weinstock:
Yeah. This, I think, going back to the theme of data, the reason that there are a lot of companies focusing in on the small molecule space right now is because for decades, the industry has been building out an infrastructure to really focus on the discovery of small molecules. So putting in place large million member compound libraries that people could access to look for potential drug candidates, putting in place the robotics and the automation to screen these, building the necessary assays to actually collect the data.
Matthew Weinstock:
Large, high-throughput small molecule screening campaigns have been around for decades in big pharma. At this point, it’s a commodity, you can go out and pay for someone at a contract research organization to screen a million molecules for you and give you the data. There’s not an equivalent to that on the biologics side. The assays, they don’t exist out there in the public. So at AbSci, we’ve developed some very proprietary assays that allow us to generate data to do kind of the analogous sort of thing on the biologic side. In my mind, that’s the key thing, it’s access to data, but I’d love to hear Joshua’s take on that.
Joshua Meier:
Yeah. I mean, that’s definitely a big part of it. And I think the data actually is really interesting when you think about how you model a protein. So, small molecules, first of all, they’re just small. When you think about the history of AI research, makes a lot of sense to start with the easier problem. There’s just less to model, you need a smaller model because of that. And that’s one of the reasons why a lot of the early AI work in biology started with small molecules. But when you start to think about proteins, there’s actually a lot of really interesting data that you can exploit. For example, if you look at protein sequences, you can just go to nature and just collect dirt on the floor and there’s DNA in there, and you can sequence the proteins.
Joshua Meier:
And it turns out that data is actually really useful to our models because while you and I can’t look at that protein and understand anything about it, our models are able to see, okay, this is like what a protein looks like. And it turns out that’s really useful. One example of how we use that at AbSci is we have data sets of hundreds of millions of antibodies from patients and from animals and we give this information to a model and we say, “This is what an antibody looks like, can you give us a new antibody that you think would work in a human?” And that kind of data doesn’t really exist in the small molecule space today. Just the ability to sequence is also really powerful. So, if you want to take all this data and just read it and write it, we can do that with DNA.
Joshua Meier:
That DNA intermediary is actually really powerful in the protein space. And when we start to think about going back to the modeling challenges, proteins are just bigger. So, it’s a harder problem, but because of all that data that’s available, the ceiling is actually a lot higher. So, when you look at where the technologies are right now, I mean, AI for proteins is a lot newer, but it’s largely caught up to where we’re at with small molecules and it’s moving a lot faster. So, if you’ve seen things like AlphaFold, for example, the past year, really showing that we can predict the structure of a protein.
Joshua Meier:
People weren’t really interested in AI for proteins. “This is too hard.” And then AlphaFold comes along. They don’t even have like proprietary data and they’re able to do something like that. Now, imagine what we can do here with the hundreds of millions of data points that we bring online. We can go beyond just predicting the structure of a protein, and we can really start to design new proteins and we can design proteins that have certain functions or certain structures and really pave that way towards having models that design new drugs from scratch.
Sean McClain:
Yeah. That’s the really interesting piece here, that one of the biggest breakthroughs in biology came from large tech, this is AlphaFold and Google. That blows my mind, and again, it really shows this merging of tech and biotech coming together. And what I was even more blown away by was the research that Google, Facebook AI Research, even Salesforce is doing within biotech. I mean, Joshua, when we found you, you were at Facebook AI Research doing protein models. I would’ve never even thought Facebook would’ve even been remotely interested in this, or Salesforce for that matter. Why did you leave Facebook AI Research, where you’ve got all the compute in the world, you’ve got a great job, to come work for Absci? Some company that hadn’t even IPOed at that point, in Vancouver, Washington, doing some synthetic bio stuff. Why the heck did you come and join us?
Joshua Meier:
Yeah. I mean, maybe I have to start by even answering why I was there in the first place. If you look five years ago, there was no one doing AI for proteins. There weren’t companies like AbSci that were really out there making progress. The problems were just too hard. And so these big tech companies start working on this and naturally, if you want to work on these problems, you end up being there. But now when you have a company like AbSci that is able to actually generate data, there’s so much more that you can do with that. At Facebook, a lot of the interesting problems we started to run up against were actually around the data. We got to a point where the models just became really good. When we wanted to get the model working on a new problem, we would take the same model and just change the data.
Joshua Meier:
And when I realized that was where a lot of the creativity was lying, it’s like you need to do this in a place that can actually generate data. So, having a company like AbSci where, I mean, Sean said it earlier, the Google of synthetic biology, a model where you’re able to build up those data sets that allow you to win, that was really attractive to me. And I think the second part too is just the impact that you can have. So in a big tech company, you can make really impactful results, you can write a paper like AlphaFold and you can have a big innovation like that.
Joshua Meier:
But at AbSci, we really have the potential to take these drugs, actually get them into the clinic. You can literally save lives with the programs that you’re building here. That’s something that you can’t do in a big tech company, they’re not set up to do that. They don’t have a business model that incentivizes bringing drugs to the market. So, I think being in a place where you have data, it’s just better from the technology building front and then also the impact that you can have with that technology is just so much larger.
Robyn Karnauskas:
Hey, Joshua, could you dumb down for people AlphaFold? For those that that term is not as well understood, can you just dumb it down for a layman?
Joshua Meier:
Yeah, yeah, of course. So, AlphaFold is a protein structure prediction tool. What that means is we have protein sequence data, protein sequences are really cheap these days. We’ve had huge strides in sequencing technology, that means you take some protein, you take the stuff in a test tube, you want to just get out a sequence, a bunch of letters that indicate what that protein is. That’s called the primary sequence. That’s just the letters of the protein, but you often want to know what the 3D structure of the protein looks like. So we have data sets of protein 3D structures as well. They’re a lot smaller because it’s a lot more expensive to get that data. But you just set it up as a machine learning problem.
Joshua Meier:
This is the protein sequence, and this is the protein structure, and you train a machine learning model on that. And that’s been really exciting to the computational biology community, because it means that we can take sequences that we don’t really know what the 3D structure looks like. And now we have an algorithm that tells us what those structures look like. That’s a prediction tool. Where we’re going with AI right now is not even just prediction, we are going past that towards generation. And that’s what we’re laser focused on at Absci.
Joshua Meier:
We want to create new proteins, develop algorithms that don’t just predict what a protein does, but we want to create new proteins that don’t exist yet. If we’re creating a new drug, it’s not like predict which drug works, it’s we need to come up with a new drug from scratch. Or if we do prediction, we need something that’s so scalable that we can just predict trillions of possibilities. There’s more atoms that, as Sean likes to say, more atoms in the universe than possible protein variants that we can make … Sorry, more protein variants than atoms in the universe. So, it’s really involved…
Robyn Karnauskas:
I knew you’d use the word universe at some point in this. Every AI company talks about the universe.
Sean McClain:
…big.
Joshua Meier:
Because it’s so big, there’s so much-
Robyn Karnauskas:
It’s so big.
Joshua Meier:
… to explore, and these algorithms are really cutting down the search space for us.
Sean McClain:
Yeah. But I think that’s the big difference, again, between small molecules and proteins. As Joshua was saying, there are more sequence variants in an antibody than there are atoms in the universe. That is crazy to wrap your mind around. And that’s really why we see only a 4% success rate through the clinic because we’re only searching a small portion of the right sequence space for a given target or indication. And what AI’s going to allow us to do is search more of the right sequence spaces to be able to get the ultimate best drug candidate and increase the probability of success, increase efficacy. I mean, it’s super exciting.
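[Editor's sketch: the "more variants than atoms in the universe" comparison is simple arithmetic. The sketch below assumes the 20 standard amino acids at every position, a rough order-of-magnitude figure of 10^80 atoms in the observable universe, and an illustrative length of about 110 residues for an antibody variable domain. These figures are assumptions for illustration, not numbers from the speakers.]

```python
import math

AMINO_ACIDS = 20             # the 20 standard amino acids
ATOMS_IN_UNIVERSE = 10**80   # rough order-of-magnitude estimate (assumption)

def sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet ** length

def min_length_exceeding(limit: int, alphabet: int = AMINO_ACIDS) -> int:
    """Smallest length whose sequence space is strictly larger than `limit`."""
    length = 1
    while sequence_space(length, alphabet) <= limit:
        length += 1
    return length

crossover = min_length_exceeding(ATOMS_IN_UNIVERSE)
variable_domain = sequence_space(110)  # illustrative length (assumption)

print(f"sequence space passes 10^80 at length {crossover}")
print(f"length 110 gives roughly 10^{math.log10(variable_domain):.0f} sequences")
```

[Under these assumptions the crossover comes at a length far shorter than an antibody, which is why exhaustive search is off the table and why the panel talks about models narrowing the search space.]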
Joshua Meier:
Yeah. I’m really excited about AI. I mean, this is, like Matthew said before, dedicating my life to this technology which is just so unbelievably exciting. But Matthew, I’d love if you could explain the process of biologic drug discovery and why you’re excited about AI and machine learning applications for drug discovery.
Matthew Weinstock:
Yeah. That’s a great question. So, at a very high level, the drug discovery process works like this. First, you have to find a drug target. The idea is you have a drug, you’re going to put it in somebody’s body, that drug is going to interact with some molecule in their body to bring a therapeutic benefit. That thing that it’s interacting with in their body, we call the target. And so there are campaigns to discover therapeutically relevant targets. Once you have a target, then you need to discover a molecule that interacts with that target. And that’s a process that’s called hit identification. It’s usually a process where you screen large libraries and try to fish out a needle in a haystack that gives you a foothold on the problem. So, it’s probably not the best drug that you find initially, but it’s something that kind of has the biology you’re interested in, it interacts with the target, and then you take that initial hit and you optimize it.
Matthew Weinstock:
So, you try to endow it with chemical properties that make it better. So you want it to be maybe more soluble, have a better half life in the body. You want it to bind more tightly to the target. And so you can kind of tweak the molecule to make it better through the process of lead optimization. Then ultimately, you need to figure out how to manufacture it. And this is an area that there’s a lot of challenges where you have a drug that seems like it’s a great drug on paper, but there’s not an easy way to make it at commercial scale where you can have enough of it to treat the population that could benefit from it.
Matthew Weinstock:
So there’s the manufacturing piece. Then you go into the actual kind of clinical development. So you start preclinical, with animal models, to show that you have the effects that you’re going after. And then you move into people. First, trying to just demonstrate that it’s safe, that this drug you’ve come up with isn’t going to kill people or have bad side effects. But then ultimately, that it has the efficacy that you want. It actually statistically improves the lives of patients in that population. And the thing to highlight is this is a very slow stepwise process. And the way I like to kind of think about it is like really old school Christmas lights, when one bulb in the chain would go out and it breaks the whole chain. It’s the same story here.
Matthew Weinstock:
You can discover an amazing drug and then realize, “Wow, there’s not a way that we can actually make this at scale.” And that kind of defeats the whole purpose. And so the reason that I’m excited about applying AI to this process is because we’ve highlighted, number one, it’s a tractable problem for AI. I mean, some of the data that we’ve put out there definitely shows the promise of AI in solving problems in this space. I think the chances of success of applying AI to solving these problems are very high if you have the right data sets. The second piece is that AI can solve many of these problems simultaneously. So we can think about optimizing a drug. We can optimize its binding affinity for the target, but also maybe decrease its immunogenicity at the same time.
Matthew Weinstock:
That’s something that AI is really good at that can actually provide a lot of benefit immediately. And then the final thing for me that I’m most excited about is this is a way where AI can really make a meaningful impact on the world. I love my AI-powered Netflix algorithm and the movies that it presents to me, that improves my quality of life, but not to the extent that applying AI to life-changing therapies can impact people’s lives. And so if we can use AI to take a really slow, expensive, non-scalable process that exists in the lab and make it faster, make it cheaper, make it more efficient, spit out better drugs at the end, that’s going to really change people’s lives. So, on the question of AI, I’m going to ask you a question, Joshua. As I’ve kind of thought about this as someone who was not an AI expert, kind of raised in that space, as I’ve tried to become more familiar with it and learn, I’m seeing that AI has been around for decades.
Matthew Weinstock:
Even back right after World War II, people are talking about the promise of it and you have these hype cycles where it comes and it goes. So, why is it here to stay today? Is it because we have better computers now that can process more data? Is it because of advances in the models themselves? Is it the availability of data or is it that we have more people in the world like you that are trained to use all of the above to push the boundaries? If you could explain it in a way where my grandma or grandpa could understand it, that’d be even better.
Robyn Karnauskas:
[inaudible 00:33:53].
Joshua Meier:
I’ll try my best. It’s a really complicated subject. But I mean, really the answer is all of the above. Today’s such an exciting time to be working in this field because we’re really at the forefront of all those pieces. If we go back to the ’80s, for example, researchers had already worked out some of the initial versions of the models that we use today. In some cases, they even had data for those models, but they couldn’t get things to work in practice because they just didn’t have the compute power. So, if you look historically at AI, there’s usually been one of these ingredients that’s missing.
Sean McClain:
So you’re saying it’s NVIDIA that’s really allowed us to do what we’re doing?
Joshua Meier:
Yeah. I mean, the kind of chips that they’ve put out and the software that makes it so easy to program them have also been a really big boon to the industry. I started writing my first neural networks in 2013. And back then the tools were really not that good. It was really hard to build these neural networks. It would take weeks to just get an early version of a model working. But today with the software that’s out there, we can just move so quickly and really focus on how we can be most creative with our data.
Robyn Karnauskas:
I’m feeling really dumb right now. Neural networks building neural networks? I’m feeling really dumb. It’s impressive, Joshua, actually.
Joshua Meier:
Yeah. I mean, the technology’s impressive. I’m just amazed by this stuff every day that we’re even able to do this.
Sean McClain:
What is a deep neural network? We have deep learning versus machine learning, why don’t you just dumb that down for somebody like myself.
Robyn Karnauskas:
Like Robyn.
Sean McClain:
Yeah. Like Robyn and Sean, how do we do that?
Joshua Meier:
So, the textbook answer is they’re universal function approximators, but I think that’s the wrong answer.
Sean McClain:
Okay. That’s over our heads.
Joshua Meier:
But these neural networks, they’re like our brains, they’re these generic computation devices. They can learn to predict things. I think maybe a way to understand this is if you look at where programming has been, pre-deep learning, you write a program to solve a task. So if you look at the earlier versions of protein folding algorithms, protein structure prediction, people were implementing biophysics. It’s like let’s implement some force fields and let’s try to use some physics to solve what the 3D structure has to look like.
Joshua Meier:
Deep learning is about having the computer write the code for you. So, instead of saying, this is the data, this is my program, this is my output, you say, “This is my input, this is my output, give me a program.” And that’s really what a neural network is doing. And we call them neural networks, because I think us AI researchers like to think we’re creating brains, and there are actually a lot of connections to it. But at the end of the day, you can really think of it as a model where we’re just specifying what we want it to do, and then it gives us back a program we can apply in other places.
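As a minimal sketch of the "this is my input, this is my output, give me a program" idea described above: the example below is an illustration only, not anything from Absci. It uses a temperature conversion task and a single fitted line (ordinary least squares) as a stand-in for a neural network. The function names `fahrenheit_by_rule` and `fit_line` are hypothetical.

```python
# Traditional programming: a person writes the rule down directly.
def fahrenheit_by_rule(celsius):
    return 1.8 * celsius + 32.0


# Learning: we supply only example inputs and outputs, and get a program back.
# A one-variable least-squares line stands in for a neural network here.
def fit_line(inputs, outputs):
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    # The returned function is the "program" recovered from the examples.
    return lambda x: slope * x + intercept


example_inputs = [-40.0, 0.0, 37.0, 100.0]    # degrees Celsius
example_outputs = [-40.0, 32.0, 98.6, 212.0]  # degrees Fahrenheit

learned_program = fit_line(example_inputs, example_outputs)

# The learned program can now be applied to an input it never saw.
print(round(learned_program(25.0), 3), fahrenheit_by_rule(25.0))
```

A real neural network replaces the single line with many stacked, nonlinear layers and is fitted iteratively, but the contract is the same: examples in, reusable program out.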
Robyn Karnauskas:
It’s helpful. Because I think a lot of people don’t know the difference between deep learning and machine learning. They think they’re the same thing. They’re not. You explained it very well, actually.
Joshua Meier:
Yeah. I mean, machine learning, you’re also specifying these inputs and outputs and learning the program. I think where deep learning starts to diverge is … People throw around these terms and at the end of the day, we end up using them interchangeably. But in deep learning, it’s all about making really deep models. So getting as powerful a model as you can get. Where we spend a lot of our time, for example, at Absci is thinking about these models, like how do you represent the data appropriately? So, let’s take a 3D structure, for example. You and I know that if you flip a structure, it’s the same protein, there’s no canonical orientation of a protein. You can rotate proteins around and it’s the same thing. How do you teach a neural network to understand that? That’s something where we spend a lot of time working through new algorithms that can represent that. And that’s where we’ve seen a lot of the progress in this field in the past year or so.
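To make the orientation point concrete, here is a small sketch under stated assumptions: the four coordinates are made up, and describing a structure by its pairwise distances is just one simple rotation-invariant representation chosen for illustration, not the algorithm Absci uses. The helper names `rotate_z` and `pairwise_distances` are hypothetical.

```python
import math


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed order."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


# Made-up coordinates standing in for a few atoms of a structure.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, 1.1)]
rotated = rotate_z(coords, math.radians(70))

# Raw coordinates change under rotation, so a model fed raw coordinates
# sees two different inputs for the same structure.
coords_changed = any(
    not math.isclose(a, b, abs_tol=1e-9)
    for p, q in zip(coords, rotated)
    for a, b in zip(p, q)
)

# Pairwise distances do not change, so a model fed distances sees one input.
distances_match = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(coords), pairwise_distances(rotated))
)

print(coords_changed, distances_match)
```

The design question Joshua raises is which representation, or which model architecture, gives the network this invariance without it having to learn it from scratch.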
Sean McClain:
So, we’re like a data-centric company. Again, we’re not able to do what we’re doing, going fully in silico, if we didn’t have the data, that throughput and the quality. But talk to us a little bit, Joshua, about the actual AI and model side. How much of what we’re doing is actually cutting edge on the model side? And what’s the importance between data and creating brand new models? And how does each of those contribute to the overall problem we’re solving?
Joshua Meier:
Yeah. So, going back to the point before about being at the frontier of data, models and compute, as more data comes online, you need more powerful models to be able to leverage that data. So, to be in a healthy state, you really need to be pushing all three of those. So we’ve made major investments here in our data pipelines, of course, but we also need to … We hire the top AI scientists out there to keep pushing the modeling front, and then we provide those researchers with all the GPUs and compute in order to push these forward. So, you really have to be at that frontier of all three. And that means that the modeling that we do here, it’s beyond the cutting edge.
Joshua Meier:
We are kind of creating that next generation of models here because the models that are even out there today are just insufficient for the kind of scale of data that we’re creating here. And as an AI researcher, I mean, that’s just an incredible place to be in where as you improve the models further, you really see real world results because we have the data and the compute to power it. So, the modeling that we do here, I think is unique from a lot of the other AI companies in this space, or companies applying AI, where we’re really doing really novel stuff with the models here. If you just look at the state of the art today, it’s just insufficient for where we want to go and for modeling the data that we have.
Joshua Meier:
So we really have a big investment in the modeling here as well. And I think that’s something that … I mean, just working on this firsthand, it’s really difficult to do. And you talk to other folks in this field and we show them some of these results and they’re blown away because many folks have tried to do this before and haven’t gotten it to work. And it’s really satisfying to be able to blend those models with the data here and actually see things working in practice.
Sean McClain:
Yeah. So, I’m going to put-
Robyn Karnauskas:
A lot of other companies, when I talk to them, have to have a crystal structure in place, then use machine learning to figure out how to build a molecule. Right? So they’re limited by the amount of crystal structures that are available and what they can go after. So, I think it’s something important that people need to know from a big picture point of view, that some companies are limited in how they can use their AI.
Joshua Meier:
Yeah, exactly. So in order to … It goes back to that point about the data, the more data that you have, the models become even more interesting. That’s why if you look at fields like natural language processing or computer vision, you have algorithms now that can write, they can write language, like write English and it’s almost sometimes indistinguishable from a person, and you can do that because there’s so much data available on the World Wide Web. So, you just go and you scrape the internet, and there’s a lot of text there. So you’re really able to focus on the modeling right now. And what we’ve done at Absci is really bring the protein design space into a similar regime as well, where the scale of data that we have here allows us to really focus on building those models.
Joshua Meier:
And people have been amazed to see models writing English like a person. Having models making proteins like a person is actually not that good. We want to do a lot better than people, but it goes back to that point where AI is hard to get working on the easy stuff, but a lot easier to get working on that harder stuff. So, that’s why going back to the question I asked Sean and Matthew before, it’s like what is a biotech company doing with an AI person? There’s actually so much opportunity in this space. And it’s really greenfield right now, and it’s really exciting being part of the front runner in this space, the company with the biggest concerted effort here bringing all the pieces together. And we really do have all those ingredients to bring this technology to fruition.
Sean McClain:
So, Joshua, that’s great, but I’m going to put on my investor hat here for a second. We talk about being on the bleeding edge, we’re the first AI drug discovery for protein-based therapies to go public, there’s very few companies in this space, but where are we really at in terms of creating the bleeding edge? What are some actual concrete examples of how we are the state of the art and what does the state of the art kind of look like previous to Absci? Matthew or Joshua, you want to take that?
Matthew Weinstock:
Yeah. I’d love to hear Joshua’s opinion and then I’ll share my thoughts at the end.
Joshua Meier:
Yeah. So, measuring the state of the art is really important for progress in science. You need to understand where you’re at and where you’re headed. This is one of the reasons why we’ve seen so much progress in computer vision and natural language processing. You take computer vision as an example, there is a really popular benchmark set that virtually everyone uses in computer vision research, it’s called ImageNet. It’s a bunch of images, this is a cat, this is a dog, and the model has to predict which one’s a cat, which is a dog. And just having a set like that, you can go and tell someone what your algorithm is. What’s the state of the art? I got 95% accuracy on ImageNet.
Joshua Meier:
And that makes a lot of sense because everyone’s using the same benchmark. We don’t really have that in the protein design space today. We have a little bit of that in protein structure prediction. That’s one reason why there’s been a lot of progress on protein folding. There’s a competition every two years where they take a set of proteins and it’s like, “Here are the sequences, can you predict the 3D structure?” And they measure people’s accuracy. That’s why we’ve seen a lot of progress in protein design [inaudible 00:44:03], benchmarking-
Robyn Karnauskas:
Have you won that competition?
Joshua Meier:
Well, this is [inaudible 00:44:08].
Robyn Karnauskas:
Where are you in the ranking order of the ability to predict the protein structure? Sean, probably really needs to know.
Joshua Meier:
Well, Robyn, I mean, just for where we’re going, we’re going beyond that. We’re not just interested in predicting, getting good accuracy on predicting 3D structures, we want to design those 3D structures. And that’s why we’re really creating the next generation of benchmarks here. And to be able to do that, again, you need to have the data, you need to be able to figure out what are the benchmarks that you care about. We have complete freedom to do that. We are not looking at some data set that someone published five years ago, we create new data sets every hour. And we can change that benchmark set depending on what problem we’re trying to solve. So by having that data and being really clear about what problem we’re trying to solve, we can make a lot of progress very quickly.
Joshua Meier:
So to answer Sean’s question, where are we at? Well, I mean, there’s a couple of really key problems here. So for example, in the drug discovery space, we have models that are nearly as accurate as experimental methods. So you take high throughput experiments in the lab and we’ve been able to train AI models that are almost as good as that. So, you train the model on some of that data and then you don’t even need to run the experiment anymore. You just do it on the computer, and it’s pretty mind blowing. When you think about accuracy, it’s like are we 80% accurate in predicting these results or 70% accurate? You’re almost as accurate as some of the assays and it’s almost unfathomable that you can do that.
Joshua Meier:
But the thing is like biology has noise as well. So even if you have some error with your AI algorithms, it’s possible to go beyond what you can do with some of the assays. We’ve had some really cool advancements on the bio-manufacturing front as well. We’ve had models that have discovered accessory proteins that have doubled our manufacturing yield. So, essentially cutting the cost in half of producing drugs. Really exciting advancements on these problems in the drug discovery and bio-manufacturing space.
Matthew Weinstock:
Yeah. That was very well said, Joshua. I think in-
Joshua Meier:
Thank you, Matthew.
Matthew Weinstock:
… the interest of time, I don’t have anything to add to that.
Robyn Karnauskas:
Maybe I should ask Matthew a question about-
Sean McClain:
But I will say though, Joshua, you did sell us short. We did actually show our AI models were able to get the highest binder of trastuzumab when compared to any literature. So, I would say that our AI models are actually doing better than experimental results, which I think is really exciting.
Robyn Karnauskas:
That’s a good real world example. Matthew, let me ask you a question so you get some airtime, and we’ll talk about outside too in the interest of time. So, you’re one of the few players that are going after full length monoclonal antibodies, but Absci is also going after next generation biologics. So can you help me understand what the next generation biologics are and why are so few companies pursuing them? You’re the only one. And is it harder to work on? I assume with the talent that you have in front of us that you guys can solve any problems. I’m assuming you just went after the hardest problem to solve and you took it. So, explain a little bit more about next generation biologics as it relates to Absci.
Matthew Weinstock:
Yeah. So, let’s start by talking about monoclonal antibodies first. Monoclonal antibodies, this is the first instances of approved monoclonal antibodies. They come on the market in the 1980s and it’s a revolution, and we continue to see monoclonal antibodies get approved in larger numbers every year by the FDA. And what an antibody is is these are actually molecules that your body makes. You have a certain type of cells in your body that make these molecules called antibodies, and you can think about these antibodies as really the programmable molecules that you can reprogram to interact with different proteins. So, if you get the cold or you get some kind of disease or even cancers, your body can program antibodies to interact with that diseased tissue or with the virus to eliminate it from the body.
Matthew Weinstock:
So it’s like this ultimately programmable molecule that you can target to just about anything you want. And the advancement that happens in the 1980s is that we, as humans figured out how to program antibodies and start developing our own antibodies in test tubes to go after cancers, or you name the diseases that we’re going after. There’s hundreds of approved antibodies on the market today. And there’s a whole industry that’s been built around discovering, developing and manufacturing antibodies, and it’s great because it’s improved the lives of people around the world. But antibodies, they’re not good at everything. So, you might imagine that maybe you want a molecule that you can attach a chemical warhead to improve its potency, to make it even more effective at killing cancer, or maybe you want to target multiple molecules simultaneously, or maybe you want something smaller than an antibody that can actually penetrate into solid tumors, which antibodies are not very good at doing.
Matthew Weinstock:
So you have the emergence of these so-called next generation biologics. These are protein-based biologics that are going beyond antibodies. So, they’re antibody-like in the sense that they’re programmable and you can target them to certain molecules, but they’re not an antibody scaffold. They’re different types of proteins, some smaller, different shapes, different functions. And the issue in the industry is that, again, there’s this whole turnkey industry around discovering and manufacturing antibodies, and these next generation scaffolds don’t fit into those discovery tools or those manufacturing tools. And so just out of inertia, the industry largely kind of stays away from them, except for those intrepid companies that realize the potential of these scaffolds and are going after them. Our technology is super flexible. We can do monoclonal antibodies and we can do these next generation scaffolds, which is really where we’re putting our focus because that’s a space that we believe in and we see it’s an unmet need. There’s nobody really in the industry playing in that space.
Robyn Karnauskas:
Got it. And Sean, for you, I know you did a great topic talk at JP Morgan. Everyone should listen to it. And we’ve been all talking about AI mainly, but there’s a lot of focus from investors on your cell line platform. Can you just describe that briefly?
Sean McClain:
Yeah. So the beautiful part about our platform is not only do we screen for the functionality, but we also screen for the titer. And so when we discover the molecule, we’re actually discovering the cell line and the components of the cell line that are important for producing at high titer and quality. So the same exact technology that would discover the drug candidate along with the cell line can actually be used to make cell lines as well. So, partners have come to us and were like, “We can’t progress this drug to the clinic because we can’t produce at high enough titer and quality.” Well, we can basically fix the amino acid sequence of the protein and screen for the various folding and expression solutions that are important for making that protein at high titer and quality.
Sean McClain:
Again, that’s a beautiful part about our platform is we can integrate at different stages, we can integrate at cell line development or we can integrate at the drug discovery and cell line development or go far upstream to target discovery. And so that really gets to our big vision, which is being able to go fully in silico from patient samples being able to predict the target that’s for that particular individual really making personalized medicine a reality, and then designing the antibody along with the cell line all fully in silico, literally at a click of a button, all on the computer. And that’s really where the future is headed. And we will be the first to see that happen for protein based therapies.
Robyn Karnauskas:
That’s great. So, I guess I learned a lot, I guess we’re coming to a close at the top of the hour. But I learned a ton. I learned that I’m glad I left my PhD program because the scientists are being wiped out, so I appreciate that I’m an analyst at a lovely bank called Truist. And I’ve learned … I feel like I think just the basics of understanding deep learning and the manufacturing points that you made were really interesting to me. I did not appreciate how AI can speed up manufacturing and reduce the cost. And that’s something that I learned that was new today as well. So maybe I’ll just ask Sean, is there anything you’d like to say in closing? And what do you want people to take away from this podcast workshop that we did today?
Sean McClain:
Yeah, definitely. I would really love for folks to take away a few things, that data is super important when it comes to AI and deep learning. It’s not only the data itself, but it’s the quality of data, the throughput of data, and being able to rapidly iterate on models. And that’s what our platform and our SymBio platform really allows for. We’ve spent the last 10 years developing this platform to generate the data on protein functionality and manufacturability to really feed into our AI models, to really see our vision through, of going fully in silico.
Sean McClain:
And I hope that everyone got today really a feeling and really seeing where the industry is headed, which is really this merging of biotech and tech, AI and biology. This is the future, and it’s not only the future for drug discovery, but it’s the future for this industry. And I couldn’t be more excited to be collaborating with our partners, our team. And everything that we’re doing here wouldn’t be possible without our team. And I’d also just want to thank you, Robyn, for hosting us today. This has been a ton of fun. I will put one last plug out there. Matthew, do you want to talk about the podcast that we’re actually starting?
Matthew Weinstock:
Yeah. So, Sean and I are in the process of launching a podcast where we’re going to be exploring topics like this and other exciting topics, convergence of technology and biotech and innovation. So, I guess, keep your eyes open for that. It should be dropping soon.
Sean McClain:
Yeah. And one last plug, if any AI talent is listening here, we are hiring, we are looking for the absolute best talent both on the AI and the wet lab side. It’s absolutely critical for our growth, so please apply. We have extraordinary things we’re working on.
Robyn Karnauskas:
[inaudible 00:55:34].
Matthew Weinstock:
Yeah. Come change one protein at a time with Absci.
Sean McClain:
One protein at a time in the world.
Robyn Karnauskas:
I learned a lot today. I had so much fun too. You guys are great to chat with. So, I really appreciate you making my lunch time enjoyable. Thanks a lot, Matthew, Sean, Joshua. I think this really will be beneficial for a lot of people and it’s great to hear more about your company. Thank you, guys.
Matthew Weinstock:
Thank you, Robyn.
Sean McClain:
Thank you so much for having us, Robyn.
Joshua Meier:
Thanks, Robyn.
Robyn Karnauskas:
You’re welcome. Take care, everyone.
Sean McClain:
Bye.
Matthew Weinstock:
Bye.
Joshua Meier:
Bye.