49. The Value of Data, Part 2 (Leo Polovets)


Leo Polovets of Susa Ventures joins Nick to cover The Value of Data, Part Two. We will address questions including:

  • Transitioning to business models and the way data can be wielded to generate value. Can you provide an example of a business that relies on data within their business model to create this symbiotic, accretive value generation?
  • You’ve cited three ways in which data can be monetized. Can you highlight each?
  • I want to touch on HOW companies can put this into practice… including content companies, ecommerce, data providers, and tools. Let’s start out w/ content companies. How might an early-stage company in the Content space put together a data game-plan?
  • How about an ecommerce company? What’s an example of an ecommerce startup that has executed a strong monetization game plan?
  • How about data providers? What’s their model for generating revenue and who are their customers?
  • Finally, on the B2B or B2C tools-side, can you talk through an example, let’s say a B2B SaaS company that effectively monetizes data?
  • How should data-centric companies think about their pricing models and can the pricing model change for the same product that has different value to different types of customers?
  • Taking a step back, as an early-stage investor, what advice would you have when thinking about data and investing in companies that are either focusing on data or have an element of their strategy where valuable data is being acquired?
  • As a startup founder that projects to have a business where the data is the competitive advantage and creates the key value…. what thoughts come to mind in terms of crossing the data chasm in order to get the necessary traction early without a data advantage in order to reach a point where a critical mass of data exists to realize that competitive advantage?

iTunes:  http://apple.co/1QEDvfO

Direct-audio:  http://bit.ly/1NM0EqE

SoundCloud:  http://bit.ly/1lHhuAn

Guest Links:

Key Takeaways:
For today’s takeaways, I wanted to review the three major sections and their sub-components within the value of data series.
1- The Importance of Data
Here we discussed the historical reasons why data is becoming more important and also the ways data has become a strong, defensible, sustainable competitive advantage.  Older forms of competitive advantages, like software and hardware, are no longer as defensible.
What are the types of competitive advantage?  Leo mentioned:
  • Recommendations… where he gave the example of Yelp reviews
  • Improved Efficiency… where Leo mentioned Uber and their ability to optimize their fleet of vehicles to serve the demand in different locations at different times
  • General Predictions and Modeling… where he talked about LendUp, which can make better predictions about what people can and cannot pay than an older, inaccurate method such as a credit score
2- Data Collection
This section related to collection methods and tips.  Here we reviewed how to tell if the data can be valuable and how to collect it.
4 attributes of data
1. It’s hard to build.  Easy to acquire data sets will have less value because they’re not proprietary.
2. It’s clean, accurate and up-to-date.  Bad data and old data do not just have no value; they actually have negative value if they are being used to make decisions.
3. The data is useful.  In this portion Leo compared purchase history and its tremendous predictive value vs. data on shoe size, which may have very limited value.
4. The size of the data set.  This relates to sample size and statistical significance, but even beyond that, large data sets are not only more relevant but can be used in many more ways.
5 Major sources of data
1. Direct collection… Asking users for feedback
2. Crowdsourcing… So, where direct collection is often an outbound request, this source typically occurs inbound when a user chooses to contribute, unsolicited.
3. Paid Crowdsourcing… This is different than the previous in that the company has hired an individual or service to acquire data.  In this case the data may be publicly available but not organized the way you need it or may be scattered across many sources.
4. Data Exhaust… This is data collected during normal usage of a product, often without the user realizing it.  It could be as simple as which link gets clicked when a list of five links is presented.  As digital online networks have grown, the importance of Data Exhaust has grown with them.
5. Combining Data Sets… This is the inter-relationship between data sets that creates insights.
And Leo closed-off this section by suggesting that startups collect as much data as early as possible.  The analysis and processing of the data is less important early-on, but merely the fact that it has been collected in a clean way will create many opportunities for a competitive advantage down the road.
3- Data Business Models
This is where we reviewed different businesses that use data at their core.
Monetization Methods:
1. Selling the data directly.  If you have data that others would like access to, the data itself can be the product or service
2.  Increasing Revenue.  This is possible via better recommendations, better ad targeting.  Essentially the better you understand your customers the better you can serve them through products and services.
3.  Expanding Margin.  This has to do with optimizing pricing or optimizing the cost-side of the business.  An example here was holding more appropriate levels of inventory, which can reduce inventory cost and also increase revenue by preventing going out of stock.
Types of companies:
1. Content Companies: For these types of businesses Leo advocates A/B testing different types of content and different headlines.  He also talked about measuring which types of content may elicit more sharing and which types may cause more engagement and time reading other related material.  He called this “instrumenting readership.”  Depending on current goals for growth or engagement, different approaches can be used based on the data.
2. eCommerce: Here he cited companies that send out boxes of products and subsequently monitor what people keep, what they send back, what types of products and product characteristics most resonate with them.
3. Data Providers: For this group, Leo talked about those that sell access to premium data, those that sell API access to raw data, and finally those that wish to augment their existing data sets with external data.
4. B2B and B2C Tools:  The final category that Leo has written about relates to tools.  In this area, Leo’s favorites include tools that improve efficiency by converting emails or faxes into online forms.  Approaches such as these can be used to collect large amounts of organized data by streamlining data entry for users.
And Leo finished-off the discussion by reminding us that customers don’t come initially for the data advantages, b/c it takes time to build value from data.  So entrepreneurs need to first think about building a value proposition separate from the data in order to acquire customers.  It is only after a critical mass of users and activity that the business can evolve and the experience can be enhanced through the value of data.


Tip of the Week:   Death by Dendrogram


*Please excuse any errors in the below transcript

Nick: Part 3 of your series can’t be fully actualized without an appreciation and understanding of the previous parts. So I’m glad we covered those. And I’m glad we’re now at the more fun component, where we get into insights and talk about business models and the way that data can be used to really generate value. Leo, can you start us off with an example of a business that relies on data within their business model to create this symbiotic, accretive value generation?

Leo: Sure. I think Netflix is probably one of my favorite examples here. Obviously they have a lot of content in the form of movies and TV shows. But they also have a lot of data on what people watch, how long they watch it, whether they abandon it in the middle, and how they rate shows. And, combining your data with other data, they’ve combined this with metadata on the movies themselves: who the actors in it are, what the [inaudible 00:50]. As a result they have this really phenomenal recommendation engine, where they know you might like this movie because it has Tom Cruise in it and you like Tom Cruise, and it’s horror and you like horror movies. First, they get better engagement through good recommendations. Second, I think user satisfaction is higher, because you can see the rating of a movie, you can see the predicted rating for you watching that movie, and you can basically watch stuff that you’re almost certain to like, rather than trying random things and then maybe half the time feeling like you wasted two hours. And then I think what’s really interesting is that Netflix has used this data in more creative ways. I remember reading an article about House of Cards: Netflix basically bought the rights to it because they crunched the data and realized that, for their user base, House of Cards had exactly the right recipe to be really well liked. It had Kevin Spacey, it had a good producer and director, and this kind of plot was really popular among a lot of users. So they bought the rights because that’s what the data showed them. And then it obviously became a really big hit for them.


Nick: Many years ago, I was a subscriber at Netflix, and I got sick of sending the discs back, so I closed my membership. Then I reopened the membership about two months later, and I didn’t even send the disc back, but I was using their recommendation engine to go down the street and get a movie at Blockbuster. That recommendation and the rating certainly helped me at least eliminate bad movies, so that I wasn’t stuck watching something that wasted my time.


Leo: Yeah, absolutely. For me personally, Amazon has been similar: sometimes when I see something I want to buy on another site but the site doesn’t have reviews, I’ll go check on Amazon just to see the reviews. And then the interesting thing that happens is that two thirds of the time, Amazon’s price is cheaper and I get it delivered in two days. So I’ll just end up actually buying it on Amazon. That review database is insanely valuable for them.


Nick: Bit of a curve ball here. I don’t know if you’re invested in any of the data companies in the startup and venture world, but any thoughts on Mattermark and CB Insights? Even AngelList has got to be a great data accumulation engine at this point. Any thoughts on some of the unique companies in our investment space that are doing great things with data?


Leo: I actually did invest in a company in this space called DataFox. They’re doing pretty well and I’m really happy with them; I like those guys a lot. In general it’s an interesting space, because I think more data is great for VC, but I’m not 100% sure it’s good for predictions. It remains to be seen if it can really help with decision making. I’ve talked to a few of the larger VCs that have data science teams that have tried to basically build models to replace VCs, and it hasn’t worked very well. But on the flip side, I think it’s really great for research and discovery. You’re looking at a company and you want to see what the other related companies in the space are, how long this company has been around, how fast their social presence has been growing. I think it’s really great for that kind of stuff, to make better informed decisions for yourself. And what’s interesting is that Mattermark and DataFox, and I’m not sure about CB Insights, but a lot of these companies are actually shifting away from investors and more towards sales people. I think that’s a more natural area to show information on companies: you are looking at a lead and you see they work at some place you’ve never heard of, and now you can get a lot of data about that company and see that maybe you already sell to three of their competitors or a couple of other clients. It’s really useful for helping qualify leads for sales, which is where some of those companies tend to be heading.


Nick: You’ve highlighted three ways in which data can be monetized. Can you touch on each?

Leo: The first way is the most direct way: you can sell the data directly, which is what my last company, Factual, did. They built up this giant database of points of interest, and then they would just sell it. Sometimes we had data dumps, sometimes we had [inaudible 4:45]. You can also sell the data indirectly, which I think is what companies like DataFox and Mattermark do. Or maybe a company like LoopNet, where you have some shell website and the website is basically just your data formatted nicely with some links in it. Those can be pretty good businesses. The next way to monetize data is to increase revenue. This could be with better recommendations or better ad targeting: basically understanding your customers better, so that you can offer them more value that they would be willing to pay more for. The last way, which is related to increasing revenue, is improving profit margins. This might be using data to optimize your funnel, optimize your prices, or lower your costs by optimizing your inventory levels. Again, it’s looking at your data and having it lower your cost of doing business.


Nick: I want to touch on how companies can actually put this into practice, including content companies, ecommerce, data providers and tools. Leo, how might an early-stage company in the content space put together a data game plan?


Leo: I don’t have a lot of direct experience in the content space, but when I look at a company like BuzzFeed, they’ve done a really great job with using data on their end. A lot of this comes down to heavy A/B testing: figuring out what kind of content works the best, what kind of headlines work the best, what kind of topics your users find most interesting. Also, as you get at least a little bit of scale, you can start to do things like trending topics and recommendations, again things that increase engagement for existing users, and also a page of trending topics that might attract people who have never been to your site before but want to see what’s happening. And then as your usage volume goes up, you can start looking at what people are reading about. Maybe you can use that to personalize ads or decide where you want to produce more content. So I think a lot of it is instrumenting readership and seeing what people are reading, what they are reading together, what they are sharing with others. You might focus on different ones of these as you’re focusing on different aspects of your business. If you want more growth, maybe you focus on topics that get more sharing. If you want more engagement, maybe you focus on topics where people read them and then keep reading other things on the same topic. A lot of it is just about collecting data, looking at it and analyzing it.
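
Editor's sketch: one common way to read the result of a headline A/B test like the ones Leo describes is a two-proportion z-test on click-through rates. This is only an illustration, not a description of BuzzFeed's tooling; the function name, the click and view counts, and the choice of test are all assumptions made for the example.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click-through rate.

    Hypothetical helper: compares headline B against headline A using the
    pooled-variance normal approximation.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both headlines perform equally.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: each headline was shown to 2,000 readers.
z, p = two_proportion_z_test(clicks_a=120, views_a=2000, clicks_b=165, views_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two headlines is unlikely to be noise. With small samples or many simultaneous variants, a more careful method would be needed.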


Nick: How about an ecommerce company? What’s an example of an ecommerce startup that has executed a strong monetization game plan?


Leo: The interesting ones, to me at least, are companies like True&Co or Warby Parker, where what they’re basically doing is sending out these boxes of different items and then keeping track of what people kept and what they did not keep. They are also getting a little bit of data on the customers, so that they have a good sense that this kind of customer prefers these glasses to those glasses, whereas this other kind of customer prefers something else. I think that’s a pretty interesting play, especially with something like True&Co. I think they basically sell women’s clothing, and they send you a box of a few items. Maybe in the early days they sent you five items and you bought zero or one. But as their recommendations get better and better, maybe they send you 5 items and you keep all 5, or you keep 4 of the 5. It’s one of those interesting models where the customer acquisition cost is relatively fixed, but by using data there’s so much room to make more revenue from each customer.


Nick: I’ve seen some of these companies that have created so much competitive advantage around data starting to do either product extensions or complete business model extensions in places they’ve never played before. Do you see a future where a lot of these big technology players that have more robust data sets than anybody else end up becoming larger and playing in even more spaces?


Leo: I think so, although I’m not sure what the future would look like. I think one of the challenges is that getting data and building APIs around it might be a very different business than your core business. If you’re focused on writing content at BuzzFeed, maybe you don’t want to be building APIs to show people, I don’t know, readership data. So I’m not sure if companies will go in that direction of selling the data themselves. Or there might be something more akin to, I imagine, a data marketplace where you can publish a data set, the marketplace takes some cut, and then you can sell your data to whoever happens to want it.


Nick: So the next category was data providers. What’s their model for generating revenue and who are their customers?


Leo: I think the one I can talk most about is the one I worked at for 4 years, which is a startup called Factual. What they do is build up data sets of places around the world. These are the kinds of things you’d see if you’re using Yelp or Foursquare or Facebook Places. So they’re basically selling picks and shovels to all of these local apps and ad targeting companies that need information like: given this latitude and longitude coordinate, is it inside of a business, what businesses are there in the area, questions like that. So the customers are basically anybody that needs location data. In Factual’s case, it’s mobile apps, like location-based apps, and advertisers. At the basic level, Factual just sells the data directly, sometimes through data dumps, sometimes through APIs. And then for certain verticals they go deeper. For advertisers, for example, they have some APIs that don’t just show you the data but help you do, for example, customer profiling based on what locations a person has been in. So they have started to go one level above the data, and to offer value-added APIs.


Nick: Are the pricing models the same for their data across a variety of different customer segments and applications?


Leo: No. And I would also say that pricing for selling data directly is very hard, because it has different value to different people. For a SaaS product you could sell it by seat, and maybe that’s a good approximation of how much value somebody gets. For data, it can be really different. If somebody is using location data for check-ins, maybe they only make 50 cents per customer per year. If they’re using it to figure out where to put their next McDonald’s franchise, maybe that’s worth twenty grand. So figuring out how to price that stuff is very experiment driven, it’s a little bit hit and miss, and it’s all about trying a lot of things and learning from experience.


Nick: The very first project I ever worked on for my last company was for a large mail provider.


Leo: Email?

Nick: Old snail mail. Essentially, the data that they could not collect and could not measure was the weight of an envelope. Now, they could stop their line, but these envelopes are moving at incredibly high speeds and they’re processing hundreds of thousands of these things every hour. So they weren’t able to determine the weight of all these envelopes, and they estimated a loss of over nine figures a quarter in what they call revenue recovery: postage revenue that they should have been getting and couldn’t, because they weren’t able to verify whether somebody had put the correct postage on the envelope. What they could do was use vision systems to check what the postage was. So if it became obvious from the vision system they could kick it out, but they couldn’t determine the weight. And so they brought us in. We were a motion control company with motors, drives and controllers, and we were able to set up a mechanical assembly with a pinch wheel, using F = ma (force = mass × acceleration), to apply a force to a letter; then we would measure its velocity at two points in time and calculate the mass based on that. My suggestion was: why are we just selling motors and drives out of that? We’re only going to make hundreds of thousands a year in business through this huge provider. Why not set up some sort of revenue share as our pricing model, based on the percentage of revenue recovery that they get out of this program? And I got a lot of really confused looks from my presidents and my general managers, who were operating on a business model where they sold hardware and software as a product and that was the transaction.
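
Editor's sketch: the measurement Nick describes follows from F = ma. A known force is applied, velocity is sampled at two points in time to get acceleration, and mass is force divided by acceleration. The code below is a minimal illustration under idealized assumptions (constant force, no friction or drag); the function name and all numbers are hypothetical and are not taken from the actual system.

```python
def estimate_mass(force_newtons, v1, v2, t1, t2):
    """Estimate mass in kilograms from F = m * a.

    v1 and v2 are velocities (m/s) sampled at times t1 and t2 (s) while a
    constant, known force is applied. Idealized: ignores friction and drag.
    """
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    acceleration = (v2 - v1) / (t2 - t1)
    if acceleration <= 0:
        raise ValueError("velocity must increase under the applied force")
    return force_newtons / acceleration


# Made-up reading: 0.5 N applied, speed rises from 2.0 to 2.2 m/s in 10 ms.
mass_kg = estimate_mass(force_newtons=0.5, v1=2.0, v2=2.2, t1=0.000, t2=0.010)
print(f"estimated mass: {mass_kg * 1000:.0f} g")
```

The estimated mass can then be compared with the postage read by the vision system to flag underpaid envelopes.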


Leo: That’s really interesting. I mean, to me that’s a little bit of like where I see these companies like Mattermark and DataFox going, which is they started by selling the data to VCs, and VCs will pay a decent amount for it. Like maybe it’s five grand a year, maybe it’s twenty grand a year. But they’re not going to pay a million in a year. But if you go to a sales organization, they’re selling a billion dollars in top line every year, and you can help improve it by 3%, they might pay you four million dollars a year instead of 10K a year. And so it’s the same data but it’s a lot more valuable for some people than others.
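
Editor's sketch: the arithmetic behind Leo's example can be written out directly. A 3% lift on a billion dollars of top line is thirty million dollars a year, so a four million dollar price captures only a fraction of the value created. The function names below are hypothetical and the figures are the round numbers from the conversation.

```python
def annual_value_created(top_line_revenue, improvement_percent):
    """Extra revenue per year from a given percentage improvement."""
    return top_line_revenue * improvement_percent / 100


def share_of_value_captured(price, value_created):
    """Fraction of the created value that the data vendor charges for."""
    return price / value_created


# Sales organization: $1B top line, 3% improvement, $4M annual price.
uplift = annual_value_created(1_000_000_000, 3)
captured_share = share_of_value_captured(4_000_000, uplift)
print(f"value created: ${uplift:,.0f} per year")
print(f"share captured at $4M: {captured_share:.0%}")

# Same data sold to a VC at $10K a year, for comparison.
price_ratio = 4_000_000 / 10_000
print(f"sales price is {price_ratio:.0f}x the VC price")
```

Framing the price as a share of the value created is the same idea as the revenue-share model Nick proposed for the mail provider.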


Nick: Taking a step back, as an early-stage investor, what advice would you have when thinking about data and investing in companies that are either focusing on data or have an element of their strategy where valuable data is being acquired?


Leo: I think at a minimum, it’s valuable just to think about the data in a company and whether it’s valuable. Even if you don’t do anything with that, it’s a good thought exercise and it might give you some ideas that you hadn’t previously considered. The other thing is, if you do believe there’s a data play, it’s important that there’s somebody on the team who can help with it. That might be a co-founder, it might be an engineer, it might be an advisor or an investor. Some of this advice, for example collecting data as early as possible, makes sense when you hear it, but maybe you don’t think of it if you’re not experienced in the space. So I think that’s where having at least one person who has been there and done that to help advise you is really valuable.


Nick: In my last role, I did an analysis with demographic and psychographic factors for a given customer set. And we were trying to link it too. So we did a clustering statistical analysis to create what’s called a dendrogram, to understand what the major market segments were within our customer set, and then the sub-segments that made up those major segments, so that we could profile customers. I was hoping that the analysis would then inform our product roadmap, so we’d know what sort of products to develop. Unfortunately, the segments were too distinct and too small on a market-size level to justify developing products around each of them. So instead we just used it for messaging. For the marketing team, we would take existing products and position them as safe and secure for one group, and efficient and analytical for a different group. Ultimately, it wasn’t where we wanted to go, but the data was useful.
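
Editor's sketch: a dendrogram is the tree produced by hierarchical clustering, where the closest customers are merged first and the groups are then merged into larger segments. Nick does not say which method his team used, so the single-linkage rule, the function name and the five two-feature "customers" below are all assumptions chosen to keep the example small.

```python
import math
from itertools import combinations


def single_linkage_merges(points):
    """Return the merge history of single-linkage agglomerative clustering.

    Each entry is (members_a, members_b, distance): the two clusters joined
    at that step and the distance between their closest members. Read in
    order, the list describes a dendrogram from the leaves up to the root.
    """
    def linkage(pair):
        a, b = pair
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the two clusters whose closest members are nearest each other.
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        merged = tuple(sorted(a + b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return merges


# Hypothetical customers scored on two made-up factors.
customers = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]
merges = single_linkage_merges(customers)
for members_a, members_b, distance in merges:
    print(f"merge {members_a} + {members_b} at distance {distance:.2f}")
```

Cutting the tree at a chosen distance gives the major segments, and the merges below that cut give the sub-segments. The gap in merge distances is one signal of how distinct the segments are.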


Leo: Yeah, like I said, I think your story shows that it’s a good thought exercise. Even if it doesn’t lead to where you think it will lead, you might still get value out of it.


Nick: So, as a startup founder that projects to have a business where the data is the competitive advantage and creates the key value…. what thoughts come to mind in terms of crossing the data chasm in order to get the necessary traction early on without that data advantage in order to reach a point where a critical mass of data does exist to realize that competitive advantage?


Leo: I think Chris Dixon had this great blog post called “Come for the tool, stay for the network”. That was more for network effects companies, I think, like Pinterest or LinkedIn, where a lot of times when people first join, they don’t have a thousand friends on Pinterest, especially if you’re an early Pinterest user. But you come because it’s a great place to save a bunch of photos that you want to bookmark somewhere. So people come for that. But then as the network grows, that’s where most of the value comes from. And I think this applies to a lot of data businesses too. When a person first signs up for Yelp, there might not be a ton of Yelp reviews, but maybe it’s useful because it’s a place to save your own reviews. So I think there’s this powerful play of giving people valuable tools. And then as part of them using those tools, you get some interesting data, whether it’s restaurant reviews or what kind of things people like to read or something else. So I think my best advice would be to build a product people want to use regardless of the data. Then you’ll have the data that you collect, and it will make that product even better over time.


Nick: I think my parents five years ago didn’t know what Yelp was when I mentioned it to them, and now their grandchild doesn’t know what the Yellow Pages are: “You mean they used to deliver this huge yellow book?”


Leo: I actually just moved to a house, a new house about two months ago, and last week I got a small yellow pages book on my doorstep.


Nick: Wow


Leo: I was surprised, I didn’t realize people actually made those anymore.


Nick: And you live here in the San Francisco area, so it’s even more surprising.

Any final thoughts on the value of data or anything that I may have missed that you’d like to touch on before we wrap up?


Leo: I think my parting thought is just that data is very valuable, and if you think through it a little bit at the seed stage, it can become a really tremendous asset as your company grows, for retaining users, for attracting new users and for fending off competition. So I think it’s definitely something that founders should think about, even if they end up deciding not to do anything with it.


Nick: Alright. Well, Leo, thanks for coming out here and doing this interview in person. I really appreciate the time. And I’m sorry you got lost on the way. But next time I’ll make sure to give you better directions.


Leo: Maybe next time we’ll be out in the Midwest, right?


Nick: Yeah, right. Let me know when you come out. Thanks again, Leo.


Leo: Thank you, Nick.