59. Algorithm-Based Investing, Part 1 (Andrew Parker)

Andrew Parker of Spark Capital joins Nick to cover Algorithms as a Competitive Advantage. We will address questions including:

  • Today we are talking data and algorithm-based startups. You’ve written a great deal about startups using algorithms as the basis of their value. When and why did you begin investing in startups with a focus on algorithms?
  • Can we start off with an example of a business or two that are using algorithms as their competitive advantage and why the algorithm provides this advantage?
  • Is there an example startup that you could highlight where the common perception is that the algorithm is the key to their business but, in fact, that’s not the case?
  • You’ve written about technical entrepreneurs who think in code and use elegant algorithms to solve complex problems. How do you go about evaluating the technical expertise of an entrepreneur or founder?
  • Does the stack or coding language that an entrepreneur is using affect your opinion? Do you treat older language coders the same as new and more responsive language coders?
  • Those startups that are using algorithms to solve complex problems can gain a strong first-to-market advantage. How does one assess the value of an algorithm at a very early stage?
  • Do you look for algorithmic parallels or analogies across different industries to see how smart algorithmic solutions could be applied to similar problems in a different context?
  • Data can reveal problems that consumers may not have even realized exist. Can you talk about how you assess whether the problem being solved has real demand in the market?
  • With regard to startups using data and algorithms as fundamental value, how should entrepreneurs think about and protect their IP, and how might a VC firm assist?

iTunes:  http://apple.co/1mCBKnH

Direct-audio:  http://bit.ly/1n4iob0

SoundCloud:  http://bit.ly/1kM989u

Guest Links:

*Please excuse any errors in the below transcript

Today we welcome Andrew Parker from Boston. Andrew is a former technical programmer and designer turned investor. He began his VC career at Union Square Ventures and is now a general partner at Spark Capital. I first came across Andrew through his blog at thegongshow.tumblr.com. And aside from his great writing, one of my favorite things about the blog is that he includes his investment thesis for each deal he’s led. If you want some insight into how VCs make decisions, I think it’s one of the better blogs out there. With that said, Andrew, welcome to the program and thanks so much for your time today.


Andrew Parker: Oh, well, hey, thank you so much for having me. It’s a pleasure to have the opportunity to speak to your listeners, and thanks for the really generous introduction there for the blog. I’ll get a couple more readers out there.


Nick Moran: Well, thanks for writing it.


Nick Moran: So before we jump into the topic, can you start us out with your background and how you became involved in startup investing?

Andrew Parker: Sure, yes. So I kind of have two stories in my background. One is a professional story that I can walk through which, you know, makes some kind of an arc about why I made which career moves when I did. But the real story is that I really have just been chasing a woman around the country for about a decade now. So, I first met my wife, Lisa, at undergrad at Stanford. She was studying biology there, and I was studying symbolic systems, kind of a mix of computer science and linguistics and philosophy and education. You get a lot of breadth but not too much depth. And so my first job was in the Bay Area at a hosting company called Homestead. They ended up selling to Intuit a year later. I was doing product design and product management for them. But I largely chose that because, you know, my wife wanted to hang around the Bay Area. She’d gotten a job at a local biotech, and so I was like, yeah, let’s do this, let’s hang out together, and that was great. But then she applied to medical school across the country. She got into Columbia P&S in New York. So I was like, going to New York. [I had applied to a graduate program with a] specialization around computer vision that I just thought was going to be super interesting. And, you know, I was in management before. My major had a lot of breadth but not much depth. So I thought that a graduate degree would be a really good way to get that depth that I felt like I was lacking. And I was two weeks away from enrollment, and I just kind of fell over backwards on the opportunity to work with Fred Wilson and Brad Burnham at Union Square Ventures. And I was the second analyst they had ever had at that program. I think you interviewed the first one they ever had previously on this show, Charlie O’Donnell.


Nick Moran: Yeah, that’s right.


Andrew Parker: And yeah, so I followed him and it was an amazing experience. I mean, I knew a little bit about venture capital because my father was a venture capitalist when I was growing up. And so I got kind of like a kitchen table view. He always worked in a very different market. He was doing more specialty materials and chemicals and some biotech and stuff like that, whereas I don’t do any of that stuff. But I had a sense of what the job was like. And so then to actually, you know, do it on the front lines, I just thought it was great. And so I drank from the fire hose at USV for 4 years. And then my wife graduated medical school and got a residency up in the Partners program between MGH and Brigham and Women’s, doing OB/GYN. And so I was like, alright, Boston, here we come! And I called up the guys at Spark and I said, hey, I’m moving to Boston, I’d really like to work out of the same city where I’m living, and so I’m going to leave Union Square, and what do you think about working together? And it was a really easy conversation. I already knew about half the partners there, but had to meet the other half over the course of the week. And then they made me an offer at the end of the week, and it was great. And it’s really, really worked out well for me.


Nick Moran: So cool. We’d love to pick your brain sometime on how Fred and Brad sort of make decisions and how they size things up. But today we’re talking data and algorithm-based startups. So you’ve written a great deal about startups using algorithms as the basis for their value. When and why did you begin investing in startups with a focus on algorithms?


Andrew Parker: Well, you know, I think that’s always been something of interest to me. I liked the sexiness of this idea in IT investing that while you’re sleeping you’re making money. You know, I mean, there’s this constant machine that’s going on in the background or whatever. And so I think anyone who’s interested in this space at some basic level shares that common interest in the idea that there is a machine that’s been programmed to do something and repeat it a million times, and you found some marginal value in doing that. But I think, you know, the interest was really honed, as I described earlier, drinking from the fire hose at Union Square Ventures. You know, data as a form of a network effect. You know, some way in which, as a startup gets more data, the value in that startup starts to compound on itself, and the way you process that data creates even more value than the sum of the data itself. That’s certainly a piece, or at least a derivative, of what they had focused on a lot at Union Square. And that’s where I honed my investment lens, learning from Fred and Brad and later Albert there. And so, that’s a piece of it. And then the other piece, I would say, is when Coursera first launched, there were only three classes. There was an artificial intelligence class, a database class and a machine learning class. And I found this model for learning online really just wildly novel. And so I signed up right away, specifically for the machine learning class. It was taught by a Stanford professor whose class I didn’t take while I was at Stanford, but I had seen him guest lecture a few times and I knew I was going to be in for something great. It’s Andrew Ng’s machine learning course on Coursera. It still exists. You could take it tomorrow. There, you know, I really got an appreciation for a lot of the machine learning algorithms. There’s a whole bunch of different flavors that you cover in the course (6:02), and getting hands-on with that gave me a much finer appreciation for just how much leverage there is in trying to build a business on top of one of these algorithms, to really make it work for a value proposition in a startup.


Nick Moran: Yeah, we recently had Tomasz Tunguz on the program and got to talk a little bit about machine learning. But the whole area is just so fascinating. And,


Andrew Parker: Yeah, I love Tomasz’s blog man. He does a great job. I haven’t met him in person but I’m an admirer from afar.


Nick Moran: Yeah, great guy, very authentic. Really appreciated him coming on. But, Andrew, can we start off with an example of a business or two that are using algorithms as their competitive advantage and why the algorithm provides that advantage?


Andrew Parker: Yeah, sure. It’s rare you see a company built solely on one algorithm alone. Instead, you know, you’ll see companies that compose their product, their value proposition, through a combination of both algorithms and then usually some kind of, you know, say human filtering or design or other touch or whatever. But there are definitely corners of particular products that largely sing well because of the algorithm that underlies them. So, you know, you’ve seen a huge rise in startups that pitch themselves as "Uber for X" (7:19). And I think part of what they’re describing there is you hit a button on your phone and something magically comes to you. And behind that simplicity is usually some kind of logistics-based algorithm, routing algorithm, some way in which they’re matching supply and demand in real time. You know, in the case of our own portfolio company Postmates, that’s certainly true. You know, when you’re asking Postmates to, say, bring you food from a local restaurant or something like that, there’s a lot of heavy lifting going on in the back end in order to match supply and demand efficiently. Think about route optimization. There are some classic computer science problems that are being tackled in here, like the traveling salesman problem or just other kinds of logistical issues. So, you know, you’ll see this kind of thing crop up in startups all across our portfolio one way or another. I’d go into other examples of other algorithms that solve other problems, but I don’t want to dig too far into this question. I’ll follow your direction.
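To make the route optimization point concrete, here is a minimal sketch of a nearest-neighbor heuristic for the traveling salesman problem. It is not how Postmates or any other company actually routes couriers; the stop coordinates and function names are made up for illustration, and a greedy heuristic like this gives a reasonable route, not a guaranteed optimal one.

```python
import math


def route_length(points, order):
    """Total distance of visiting the points in the given order (open path)."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def nearest_neighbor_route(points, start=0):
    """Greedy heuristic: from each stop, go to the closest unvisited stop."""
    unvisited = set(range(len(points))) - {start}
    order = [start]
    while unvisited:
        here = points[order[-1]]
        # Break distance ties by index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(here, points[i]), i))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


# Hypothetical stops as (x, y) in km; index 0 is where the courier starts.
stops = [(0, 0), (5, 5), (1, 0), (5, 4), (2, 1)]
route = nearest_neighbor_route(stops)
print(route, round(route_length(stops, route), 2))
```

On these made-up stops the greedy route is much shorter than visiting the stops in the order they were listed, which is the kind of saving a routing layer is after. Real dispatch systems also have to handle live demand, multiple couriers and time windows, none of which this sketch attempts.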


Nick Moran: Okay. How about the other side of the coin? Could you give us an example of a startup where the common perception may be that they’re using algorithms as the key value for their business but in fact it’s not the case?


Andrew Parker: Yeah, I mean this is definitely a thing that occurs. I think it’s a really interesting question. You know, for example we made an investment in a company called IEX. And now I don’t want to use IEX as an example to say that there’s no algorithms there. In fact, they’re probably one of our most algorithmically intensive portfolio companies. But there was one particular problem that they solved that I thought was super interesting. And maybe some of your listeners have read about this in the book Flash Boys by Michael Lewis. So IEX was the featured company in that book.


Nick Moran: Sure, yeah.


Andrew Parker: The purpose of the product, the purpose of the company, is to try and help large block traders, large institutional public equity holders, get liquidity for their holdings without being front run by high frequency trading algorithms. And so you see this constant cat and mouse game evolve during the course of the book. And I found one technique of combatting the high frequency traders super interesting, because you’d think it would be algorithmically intensive, but instead it’s actually done with simple basic physics. As an exchange, they are willing to connect to counterparties’ computers, people who are going to trade in their network and provide the liquidity for their exchange. But many of those people are high frequency traders, and so rather than trying to identify those people a priori, instead they just take away their advantage by running fifty miles’ worth of fiber optic cable inside of a small little plastic box, which is basically the size of a bread box. And that box is a thing that exists. You can buy it in retail, you could buy it on Amazon, because it’s used for diffusion testing of long distance fiber optic network transfer for these, you know, kinda legacy fiber optic infrastructure companies. They use it specifically for testing purposes. And so they just bought a bunch of these boxes and stuck them directly in between their exchange and the counterparty’s computer, and they took away the high speed advantage that’s necessary for all these high frequency traders out there to work, by just simply running the order flow over fifty miles’ worth of distance, even though the computers are only, you know, ten feet apart. So I felt that was super clever. You’d think that an algorithm would be able to solve this problem better, but in fact it’s just physics.
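The physics behind the coiled fiber can be checked with a back-of-the-envelope calculation. The sketch below uses the fifty-mile and ten-foot figures from the conversation and assumes a refractive index of about 1.47 for glass fiber, which is a typical textbook value and not a number from the interview. The function and constant names are made up for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47         # assumed typical value for glass fiber
METERS_PER_MILE = 1609.344
METERS_PER_FOOT = 0.3048


def one_way_delay_seconds(length_m, refractive_index=FIBER_REFRACTIVE_INDEX):
    """Time for a signal to travel length_m of fiber, ignoring equipment latency."""
    return length_m * refractive_index / SPEED_OF_LIGHT_M_PER_S


coil_delay = one_way_delay_seconds(50 * METERS_PER_MILE)   # the coiled fiber box
direct_delay = one_way_delay_seconds(10 * METERS_PER_FOOT)  # computers ten feet apart

print(f"50-mile coil:  {coil_delay * 1e6:.0f} microseconds")
print(f"10-foot cable: {direct_delay * 1e9:.0f} nanoseconds")
print(f"ratio: {coil_delay / direct_delay:.0f}x")
```

Under these assumptions the coil adds a delay of a few hundred microseconds in each direction, against tens of nanoseconds for a direct ten-foot run. Every participant gets the same delay, so nobody can win purely by being physically closer.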


Nick Moran: You know, sometimes the most simple elegant solutions are better than the overly complex, overly designed solutions that you often read about or hear about.


Andrew Parker: I totally agree with that. Yeah, and there is like an Occam’s razor that kinda emerges there, where if you can find the simplest solution, let’s just go with it.


Nick Moran: Yeah, I coded in college in eight different languages, and I think my code was always three or four times as long as some of the better coders in the class. So I was the least elegant but if I could get to the answer then I was happy.


Andrew Parker: Yeah, yeah. I remember the best coders, there was certainly a brevity to their code. But then there was also a readability that was so impressive, where you know just simply by reading function names and variable names you knew exactly what was going on even if you couldn’t follow the logic. And that really separated, you know, some of the best programmers in my undergrad program.


Nick Moran: Good point. You often have to debug other people’s code. So if it’s readable and indexed, it’s a lot easier.

So Andrew, you’ve written about technical entrepreneurs who think in code and use elegant algorithms to solve complex problems. How do you go about evaluating the technical expertise of an entrepreneur or a founder?


Andrew Parker: Well, I think to a certain degree you can treat a technical founder like a black box, and just see what comes out, right. So you can do this, I think, in a couple different ways. One is seeing just the raw product that comes out of that box. Now you might not be analyzing an algorithm alone, right. Instead you might be analyzing an algorithm plus design plus some speed or latency in terms of, you know, the network efficiency for what they’ve built or something like that. But if there really is value in the algorithm that’s at the core of what they’re doing, that should be transparent in the product execution. And if it isn’t, then that’s an adverse signal that you can look for. I think the other way you can treat a technical founder like a black box, and kind of see what comes out, is through thought leadership and code. And I think this is a concept that has really risen to prominence over the course of the past decade, and was true prior to then but less so. So the primary web presence for a really strong technical founder is usually going to be GitHub, where you can see the frequency with which they’re checking in code, and how starred or forked any given project that they’ve uploaded there is. You can definitely get a sense of the heat, the activity, from a GitHub profile in a way that, you know, before GitHub existed would be pretty hard to wrap your arms around. And then if you think even further back, you know, the 80s, kinda minicomputer era, something like that, IP was treated very differently. People were super protective, constantly trying to lock down their IP. The idea that you’d incorporate, you know, open source technology into your company was heresy. And so we’ve just come such a long way from there, that you can just look at what’s happening in this open source community and that says a lot about a technical founder.


Nick Moran: Do you look at and care whether they’re a full stack coder or running back end, and do you care about the languages? So if they’re using some of the more antiquated stacks out there as opposed to maybe some of the more modern and responsive stacks that have come out more recently?


Andrew Parker: Yeah, I don’t care about the stack or the language or any of that stuff, so long as, you know, they’ve made an intelligent choice about using the best tool for the job. And I think the best tool comes down to a couple different things. One is, is the tool actually well suited? Like for example, Twitter was originally built on Ruby on Rails, and that was just a mistake. You know, they had to rebuild a lot of the back end code there to be in C++ (unclear), because they built what was essentially a messaging app on top of a platform that was really designed from the ground up to be much more around, like, a CMS framework. So that’s one thing. Two, I think their framework decision or language decision will inform their ability to recruit. You know, like if you’re writing whatever you’re doing in JS, that is like cocaine to being able to recruit certain types of developers, which can be really helpful. But then again, you know, if you wrote it in OCaml, that’s cocaine to another set of developers.


Nick Moran: Right


Andrew Parker: Might be a smaller community, but it might also be more die hard or something like that. So like, you know, there’s some trade offs, some of them are pretty perceptive when you’re choosing a particular language and framework. And then the last thing I’d say about this is that the best computer scientists are really strongly rooted in programming paradigms that transcend all languages. And so if they’re really great, you know, no matter what language they’ve learned before, they should be able to pick up the basics in a language within two weeks and then start to really wrap their arms around the idiosyncratic corner cases in a language within two months. And so the time to ramp up on a new language is not trivial, but it’s small enough that you’re really just gonna end up hiring that sort of person at the end of the day.


Nick Moran: So Andrew, those startups that are using algorithms to solve complex problems can gain a strong first to market advantage. How does one assess the value of an algorithm at a very early stage?


Andrew Parker: Yeah, I like that question because if an algorithm is built in a way where you know the value of it increases, hopefully compoundingly, as the data that’s fed into it increases, at the earlier stages it’s really hard to know. You know, like,


Nick Moran: Right


Andrew Parker: You know, is the core of this company viable or not? So I think there is something to just product quality. You know, understanding when you actually use the product, do you see value already, and then can you use your imagination to extrapolate all of that as more data is fed in through this kind of data exhaust loop. You know, will the value proposition get better? And it might mean that you might have a false negative on, you know, particular ideas or algorithms. Like it might be really hard to understand in the early stages that Google’s search algorithm is highly differentiated from other search algorithms, because they don’t yet have a whole bunch of data exhaust from humans clicking somewhere in the space of one to ten blue links, which is what actually provides the input on whether, you know, the ranking for that given search engine result page was good or not. You know what I mean?


Nick Moran: Right


Andrew Parker: So like I think there is this feedback loop that you have to rely on. And maybe you’ll end up with a false negative, but I think that’s the best you can do, right, is just kinda treat the algorithm like an object in the world and use it and show it to end consumers, obviously wrapped around you know good UI design, and see if it creates value.
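One way to picture the click feedback loop described above is to score a ranking by where on the page people actually click. The sketch below uses mean reciprocal rank, a standard information retrieval metric. It is an illustration only, not a description of what Google does, and the click logs are invented.

```python
def mean_reciprocal_rank(clicked_positions):
    """Average of 1/position over searches, where position 1 is the top link."""
    return sum(1.0 / p for p in clicked_positions) / len(clicked_positions)


# Hypothetical click logs: the position (1 to 10) of the link each searcher clicked.
ranker_a_clicks = [1, 1, 2, 1, 3]   # people mostly click near the top
ranker_b_clicks = [4, 7, 2, 5, 10]  # people have to hunt further down the page

mrr_a = mean_reciprocal_rank(ranker_a_clicks)
mrr_b = mean_reciprocal_rank(ranker_b_clicks)
print(f"ranker A: {mrr_a:.2f}, ranker B: {mrr_b:.2f}")
```

A higher score means the clicked result tended to sit nearer the top. With five searches the number means very little, which is the early-stage problem: the signal only becomes trustworthy once there is a lot of click data behind it.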


Nick Moran: Yes, speaking of Google, they’ve famously sort of designed their search engine rankings around references and citations that the journal and publishing industry had used for many years. Do you look for sort of analogies across industries from an algorithmic standpoint when you’re assessing a startup?


Andrew Parker: That’s super interesting. I think there is some signal in there that you could probably look for. Definitely when companies I’m working with are trying to tackle a particular problem, such as, say, anti-spam issues or trying to surface recommended content, both of those are use cases that are really best solved through algorithmic innovation. But, you know, the algorithms to solve those problems are mostly known. Now they have to be really well tuned, specific to the variables in a given company, but, you know, they are publicly known algorithms that you can get as a part of an algorithmic framework. Like, you know, what Google recently open sourced in TensorFlow, or, you know, Facebook releasing Torch or whatnot. And so I do think there are ways in which algorithms can kind of transfer cross industry, typically, you know, solving the same problems. I had not really thought about trying to proactively look for companies that have borrowed from one industry to solve a problem in another. I’m not quite sure what my leading indicator would be to see that before I’ve met the team. But once I’m talking to the team, definitely. You know, if I see the pattern recognition of someone using an algorithm to solve a completely novel problem and do it well, that’s a great sign.
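The citation analogy Nick raises is the idea behind PageRank: a page matters if pages that matter point to it, much as a paper matters if well-cited papers cite it. Below is a minimal power-iteration sketch in plain Python. The four-page graph, the damping factor of 0.85 and the iteration count are illustrative assumptions, and this is a textbook simplification, not Google’s production ranking.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it cites. Returns page -> score."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then shares passed along citations.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page that cites nothing spreads its score evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical citation graph: A, B and C all cite D, and D cites A.
citations = {"A": ["D"], "B": ["D"], "C": ["D"], "D": ["A"]}
scores = pagerank(citations)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph D ranks highest because three pages cite it, and A comes second because the highly ranked D cites it. The same computation would run unchanged on a graph of academic papers, which is the cross-industry transfer being discussed.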


Nick Moran: So data can reveal problems that consumers may not have even realized exist. Can you talk about how you assess whether the problem being solved has real demand in the market?


Andrew Parker: Well, I think you’re talking about an issue that is different if you’re building a company that is creating a new market, versus if you’re talking about a company that is trying to take a share of a market that already exists.


Nick Moran: Yep


Andrew Parker: Right, like a lot of people describe, you know, in the case of Airbnb, the market for, you know, sleeping on someone else’s couch was nothing, right. I mean, it was like a free handout. There was the website couchsurfing.com, which was a free community beforehand, and was certainly an inspiration, my guess, to the Airbnb team. But it wasn’t clear that there was this massive market around people profitably sharing their home with other people.


Nick Moran: Right


Andrew Parker: And so, I think for those problems where you’re creating a new market, it’s not totally clear how data is going to help you, right. There has to be some imagination, some faith. I think it really requires just a compelling founding vision and a great founder to be able to evangelize that vision. And then an audience that believes in that vision. That audience might be a VC, but that audience might also be the founding employees, right, the first few people that are persuaded that, yeah, we’re going to build something completely new and in a direction that’s just pioneering. So I’m not quite sure how you’ll use data in that context, but, you know, certainly just the founder’s tenacity will carry you through there.


Nick Moran: With regard to startups using data and algorithms as the fundamental value, how can entrepreneurs protect their IP and how might a VC firm assist?


Andrew Parker: Yeah, so, you know, IP protection in startups is something that is a bit controversial. I think patents today, particularly software patents, which are often ineffective (23:22) as business method patents. They are hard to protect, they are harder to get issued. I wish it was even harder because, you know, at its core, when you’re trying to use a patent to protect an algorithm, you’re basically saying, hey everyone, there is some math that exists in the world that I discovered that you can’t use. And it feels so weird when you think about math as, you know, some kind of pattern recognition on nature, or a framework for thinking about the world or something like that, that just kind of explains what already exists out there in terms of patterns and numbers or whatever. And to then say, that’s my pattern, you’re not allowed to use it like that. I guess I find that kind of uncomfortable from an intellectual perspective. And so instead I really encourage companies to try and publish what they’re doing publicly, either open source or otherwise. You don’t have to publish everything. But certainly if you feel like you have some real algorithmic innovation, I find that, you know, the more you give away the more you get back. And so by publishing it open source, other people are going to contribute to it, and it’s going to make your idea better, faster. And then find ways to build your business using data as your proprietary advantage, where as the open source community is helping improve whatever algorithmic advantage you had, you’re then compounding your lead through data. I think this is exactly what Google did when they just open sourced their machine learning framework TensorFlow. They are basically saying, hey, we had some really smart engineers focus on building great machine learning algorithms and a framework for running those algorithms for years. And so now let’s push that out in the open so that everyone together can now accelerate the pace at which we’re innovating on this framework. But Google’s not sharing any of the training set data, right. They’re not sharing their millions of images inside of Google Photos, against which they run this algorithm, and they’re not sharing, you know, the corpus of search history. You know, all the ten blue links that got served up to people and which link they clicked on. And so as the open source community is helping improve their algorithms, Google has still really protected themselves for a long time through a data head start.


Nick Moran: Yep. Yeah, we recently had Leo Polovets on the program and he talked about creating data moats and how companies like Netflix have published the algorithm that they used for recommendations. But they still hold so much data that no one has access to. So they can maintain their competitive advantage and their source of defensibility.


Andrew Parker: I love the Netflix example too, because not only have they been public in terms of, you know, what is their recommendation algorithm and data over time, but they even solicited it publicly, right. They ran the million dollar Netflix Prize. This was roughly, maybe, you know, 4 or 5 years ago or something like that. I think it might be a little bit longer. But they served up their data set, anonymized, of, you know, which users had liked which movies, and then asked people to figure out which movies they are more likely to like in the future. You know, what should we recommend to them? And there were teams of people, mostly academics, focused on trying to solve this problem for them in exchange for the thought leadership of being the winning team, and also in exchange for a million dollar prize at the end. It’s just unbelievable.
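For a feel of the Netflix Prize setup, the sketch below fits a very simple baseline predictor (overall average rating, plus a per-movie and a per-user adjustment) and scores it with root mean squared error, the metric the competition used. The ratings are invented, the function names are hypothetical, and the real entries were far more sophisticated than this.

```python
import math
from collections import defaultdict


def fit_baseline(ratings):
    """ratings: list of (user, movie, stars). Returns a predict(user, movie) function."""
    global_mean = sum(stars for _, _, stars in ratings) / len(ratings)

    # How far each movie's ratings sit above or below the overall average.
    movie_resid = defaultdict(list)
    for _, movie, stars in ratings:
        movie_resid[movie].append(stars - global_mean)
    movie_bias = {m: sum(v) / len(v) for m, v in movie_resid.items()}

    # How far each user rates above or below what the movie bias alone predicts.
    user_resid = defaultdict(list)
    for user, movie, stars in ratings:
        user_resid[user].append(stars - global_mean - movie_bias[movie])
    user_bias = {u: sum(v) / len(v) for u, v in user_resid.items()}

    def predict(user, movie):
        guess = global_mean + movie_bias.get(movie, 0.0) + user_bias.get(user, 0.0)
        return min(5.0, max(1.0, guess))  # keep predictions on the 1 to 5 star scale

    return predict


def rmse(predict, ratings):
    """Root mean squared error of predictions against known ratings."""
    return math.sqrt(sum((predict(u, m) - r) ** 2 for u, m, r in ratings) / len(ratings))


# Made-up ratings, purely for illustration.
train = [("ann", "alien", 5), ("ann", "up", 3), ("bob", "alien", 4),
         ("bob", "up", 2), ("cat", "alien", 5), ("cat", "heat", 4)]
held_out = [("bob", "heat", 3), ("cat", "up", 3)]

predict = fit_baseline(train)
score = rmse(predict, held_out)
print(f"held-out RMSE: {score:.3f}")
```

The contest worked the same way in outline: teams trained on the released ratings and were scored on ratings they could not see. Publishing the approach costs little when the full rating history stays private, which is the data moat point made above.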


Nick Moran: It’s so cool.