S3 #10 Vitality of data in building and sustaining the metaverse

Guest:
Or Lenchner

For the past three years, Or Lenchner has served as the CEO of Bright Data, the market-leading web data collection platform. Under his leadership, the company has continued to expand its market base and innovation, reaching over USD 100 million in revenue.

Bright Data serves, and is trusted by, thousands of companies and organizations, including Fortune 500 firms, leading universities, public sector companies and more. Furthermore, the company recently established The Bright Initiative, an organization focused on using data to make a positive impact on the world while tackling the most pressing issues of our time.

To support their metaverse strategy, more than three-quarters of leaders, about 84% across the IT, telecom, and technology units, say they are planning to look to procure data intelligence solutions and deploy them inside virtual worlds in the next two years. In this episode, Or Lenchner of Bright Data discusses public web data scraping, misinformation, and the metaverse, as well as the privacy and security considerations we must keep in mind.

Keywords:
metaverse, web scraping, data collection, IT, telecom operations, digital operations, security, privacy
Season:
3
Episode number:
10
Duration:
31:13
Date Published:
June 13, 2022

[00:00:00] KRISTINA PODNAR, host: To support their metaverse strategy, more than three-quarters of leaders, about 84% across the IT, telecom, and technology units, say they are planning to look to procure data intelligence solutions and deploy them inside virtual worlds in the next two years.

[00:00:15] INTRO: Welcome to The Power of Digital Policy, a show that helps digital marketers, online communications directors, and others throughout the organization balance out risks and opportunities created by using digital channels. Here's your host, Kristina Podnar.

[00:00:34] KRISTINA: Thanks for joining me today on the Power of Digital Policy podcast. I'm grateful that you've chosen to share your time and get digital policy-related knowledge. Today, I'm excited to have with us Or Lenchner. For the past three years, Or has served as CEO of Bright Data, the market-leading web data collection platform. Bright Data serves and is trusted by companies and organizations, including Fortune 500 firms, leading universities, public sector companies, and more. Or, it's my pleasure to have you today. Thanks for taking the time to hang out. Let's just jump right in, right? Bright Data is known as the world's most trusted, automated web platform. What does it actually mean to be a web data platform? Can you help us understand?

[00:01:18] OR LENCHNER, guest: Sure, I'll be happy to do that. Bright Data is a leader in developing software that allows businesses and organizations to extract publicly available data from the internet on a large scale. Now, what does that mean? The web, and let's start by talking web 2.0 for now, is probably the largest database in the history of mankind. It is growing at a staggering pace on a daily basis and holds more and more information. What we're doing is allowing and helping our customers to extract that public data, and only the public data: for example, prices of products, on a large scale, in real time, so they will be able to use their own machines to make real-time decisions. For example, large e-commerce companies are using our services to understand how their competitors are pricing the products they both sell, and to be able to reprice their own products in order to win the customer, to convert these visitors into paying customers, by giving them a better price or faster shipping time and things like that. So this is basically what we're doing, on a very large scale, for around seven years, with over 315 employees, and I am in Israel, and serving over 15,000 customers.
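The repricing use case Or describes can be sketched in a few lines of Python. This is an illustrative sketch only, not Bright Data's product or API: the function name `reprice`, the `floor` and `undercut` parameters, the undercutting rule itself, and the sample prices are all assumptions made up for this example.

```python
from decimal import Decimal


def reprice(our_price, competitor_prices, floor, undercut=Decimal("0.01")):
    """Return a new price that undercuts the cheapest competitor.

    Hypothetical rule for illustration: the price never drops below
    `floor`, and it is left unchanged when there is no competitor data
    or when our price is already strictly lower than every competitor.
    """
    if not competitor_prices:
        return our_price
    cheapest = min(competitor_prices)
    if our_price < cheapest:
        return our_price
    # Undercut the cheapest competitor, but respect our minimum price.
    return max(cheapest - undercut, floor)


if __name__ == "__main__":
    # Made-up sample data standing in for collected public prices.
    collected = [Decimal("59.99"), Decimal("62.50"), Decimal("61.00")]
    print(reprice(Decimal("64.00"), collected, floor=Decimal("55.00")))
```

A real repricing system would presumably also weigh shipping time, stock, and margin, which Or mentions alongside price; the sketch covers the price comparison alone.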

[00:02:55] KRISTINA: So, Or, help us understand this because you know, you're obviously doing a lot of scraping of data, but you mentioned non-personal data. When I think about the web, I think about obviously prices, products, and services, but I also know that my mortgage information is out there. My salary information is out there. How do you ensure that you don't collect any of the user's personal information? Is that deliberate, or do you happen to collect it and then clean it out afterward? How does that work?

[00:03:27] OR: Yeah. So that's a great question. And the short answer is both. We make sure not to collect private information. The first thing that we're doing is to actively not collect data that is behind any type of login; if this information is on the web, but you need to log into a certain account to view it, we're not going there. So that's the first level. If, by any chance, we have, or I would rather say one of our customers has, collected PII, personally identifiable information, we are also able to clean it out and remove it from the actual dataset if that's what their regulation requires. But ultimately, at the most basic level, if you can see information on a public website with your own eyes, the only thing that we're actually doing is automating it. Instead of having a million pairs of eyes looking at this information, to collect it on a large scale in no time, we're automating these million users in one computer that is doing it.
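The "clean it out afterward" step Or mentions can be pictured with a minimal Python sketch. It is not Bright Data's method: the two regular expressions, the placeholder strings, and the names `scrub` and `scrub_records` are assumptions for illustration, and real PII detection covers far more than emails and phone numbers.

```python
import re

# Deliberately simple, hypothetical patterns. The phone pattern also
# over-matches other long digit runs such as ISO dates.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text):
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def scrub_records(records, fields):
    """Return copies of the records with the named string fields scrubbed."""
    return [
        {k: scrub(v) if k in fields and isinstance(v, str) else v
         for k, v in rec.items()}
        for rec in records
    ]


if __name__ == "__main__":
    # Made-up record standing in for one row of a collected dataset.
    rows = [{"price": 59.99, "review": "Great shoes, mail me at a.b@shop.io"}]
    print(scrub_records(rows, {"review"}))
```

The sketch returns new records instead of editing the input in place, so the raw and cleaned datasets can be compared during review.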

[00:04:36] KRISTINA: That makes a lot of sense. Tell us a little bit about this recent survey that you undertook. You were basically looking to gauge, I think, business sentiments around the metaverse, which is very interesting and obviously a buzzword these days. How do organizations perceive the metaverse? What are they thinking in terms of the metaverse and planning for it?

[00:04:56] OR: So that was a very interesting exercise to do because we, as a business, are still not sure what it means. And I think that every business owner or CEO or anyone, really, would probably say something very similar to each other, but a bit different, because it's not really a thing. It's almost there, but we're not there yet. So we conducted a survey and tried to take this issue from various directions, basically asking the relevant stakeholders in different businesses what they think about the metaverse, but specifically from the angle of data. And while we got slightly different versions of answers about what the metaverse really is, what we have seen is that everyone thinks the same when it comes to data and the metaverse. The gist of it is that web 3.0, whatever it will be, will eventually hold massive amounts of data. A lot of private data. This is, again, not our part, but it will obviously hold a lot of private information. For example, you will have a private discussion with someone's avatar. In the metaverse, if you're in a private room, I can guess that it will be considered to be private data. But a lot of public data will be generated in the metaverse. It's pretty clear that, in one way or another, products will be sold in the metaverse. It means that the prices of these products, the reviews of the products, all of the things that exist today in web 2.0 will also exist in 3.0, or whatever, in the metaverse; again, it doesn't really matter what the exact form will be. It also means that new content will be generated in the metaverse. And so basically everything that we see today in web 2.0, we can assume will also be visible and valuable in the public metaverse, and actually, 54% of the respondents to that survey believe that data will be vital in sustaining them in the metaverse. And I can understand why, because it's all about content. That was always the case. Web 1.0 was also about content, just with a very limited number of contributors; in web 2.0, all of us contribute content through social media, writing blogs, and things like that. So, in the metaverse, it's also going to be a lot of content, and content equals data.

[00:07:33] KRISTINA: I'm really happy to hear you say, first of all, that we're not in the metaverse yet, because I think it's very easy for a lot of executives and companies to get confused, looking at all of the headlines that we have today. It's all about like we're in the metaverse; the fashion industry is in the metaverse. Everything's already in the metaverse, but you're saying we're not there yet. Hold the boat for just a minute.

[00:07:52] OR: Yeah. So usually, exactly what you said. You read the headlines that a big fashion brand bought a metaverse studio or an NFT design studio. It doesn't mean that we're there. What it actually means is that many companies understand that there might be something there; they have a fear of missing out, or they're doing something. But you know what? The best evidence that we're not there yet is that we're conducting this podcast interview over Zoom and not with VR or AR instruments, without avatars.

[00:08:30] KRISTINA: Excellent point. Excellent point. So as businesses are thinking about the metaverse, are they also thinking about security and privacy and accessibility, or are they just really thinking about opportunity or fear of missing out, as you mentioned?

[00:08:42] OR: Yeah, so definitely they're thinking about that. I have to say that it was very hard to compose an exact question that we, as the ones who conducted the survey, would understand, and that the respondents, the other side, would also understand how to answer. So basically we asked a very general question, and 60% of the IT and tech leaders, the participants in the survey who plan to integrate business operations in the metaverse, are pretty worried about security. What it actually means, I am not sure that they really understand, but that's okay because not today, no one has really protected without the metaverse. So at least most people have the understanding of them so that the metaverse will be the most secure place. Maybe a lot of resources will be invested in trying to make it a secure place, with regard to data and with regard to other elements. But ultimately, this is a good concern to have. And you know, if this will be a major part of your business operation, it's okay to be concerned about the security. I think it's actually a healthy thing. And it seems that over half of the people we talked to actually have this idea already.

[00:10:06] KRISTINA: From your survey or just from your own experience, are you seeing organizations starting to take on training or maybe making individuals more aware of what value data has in the enterprise as a result of the move to the metaverse? Or is that just something that's very organic and still very strategic?

[00:10:23] OR: It's strategic, but we have already seen it also unfold in the more tactical areas. We are already in discussions with our more sophisticated customers about how we can help extract public data from the metaverse when it is actually a thing. From our point of view, as a product and tech company, we have already started to experiment with these things because we want to be ahead of the curve when it's already a thing. So it's not just a strategy. I can also see things that are happening, but still on a fairly small scale, more in experimental mode than actually going into production.

[00:11:10] KRISTINA: You're highlighting an interesting point to me, which is the delta between scraping of data and capturing of data versus capturing the context of data. Two different things. So when we're in the metaverse, it's more than just data. It's about the sounds, the feel, potentially the senses, right? The pain that I'm actually feeling on my wrist as a result of wearing the new band. So I can actually feel senses. So how is that going to change, do you think, for your company and also just for all of the organizations out there? Because we're no longer talking about two-dimensional data; we really are talking about data plus context, plus so much more information piled on.

[00:11:48] OR: That's a great point to talk about because it's something that is so interesting to think about. I'll just give you an example of why it's so interesting. Let's say we're conducting this interview in the metaverse in a public forum. So we have a crowd, an audience; they can join it. They can listen. And there is the distance that I'm standing from you, or the distance that you have from me, and when I'm speaking, maybe I'm taking one step towards you, just getting a bit closer. Right. Is this data that we are allowed to collect, or should be collecting? Because probably, as you said, that's context. Now, if I'm going back, it's like, you know, right now, maybe I'm not feeling confident with you. I am, but just for the example's sake, maybe I'm backing off. So that's my body language. I guess it's also going to happen in one way or another in the metaverse. Now, is this public information, or is it private information that can reflect this context, the emotional context? This is why it's so interesting. We, as a web data company, invested a lot of resources, not just our brain resources but also our engineering resources, in trying to differentiate not just between what's private and public, but also between what's ethical and what's unethical data to collect. And I think that we're doing it in a very good, responsible way, mostly because we care and we've been doing it for years. But what you just said, the context, is going to be completely new in the metaverse. And this is something that we'll have to evaluate, I think, on an hourly basis, just by seeing every single new use case that our customers raise and ask us to help with, and to consider if this is something that we need to classify and deal with one way or another: private, public, ethical, unethical. So that's extremely interesting for us. I just see it as a challenge, not as a problem. That's one of the most interesting aspects of the metaverse. I totally agree. The context.

[00:14:01] KRISTINA: And arguably, I mean, if I think about who you are in the context of GDPR, for example, you would probably just be a data processor rather than a data controller, especially if a company were to hire you to scrape their data, to actually give them insights into their environments. But are you seeing leaders thinking about that already in the context of their employees and in the context of their customers? Because, for example, if we're in the metaverse, and I'm a customer, there's so much to be gleaned around my sentiment. Like you said, am I leaning in because I'm interested, or am I leaning out because I'm not feeling comfortable in what I'm wearing? Where are my pupils moving? Are they dilated or not? And what's my heart rate? Am I excited by the product you're showing me? Or am I like, oh gosh, I'm snoozing? So again, back to that context, I'm wondering, during your survey or even in your experience, have you gotten a sense yet of how much of that organizations are thinking about? Because the concern I always have is we're going to deploy all of this really great technology, but we're going to forget about the basics, as you said, ethics, what I call the basic hygiene, right? It's like the 101s of what we shouldn't be doing. And are we going to loop back to exactly where we've been in web one and web two? Which is that our privacy is in peril in exchange for an experience. And that might be fine for some people, but not for others. And so how are organizations dealing with that, or are you even seeing the notion that the metaverse should be different than web one and two, or is it just sort of like, hey, business as usual, new technologies?

[00:15:35] OR: Yeah, that's a good point. So first of all, at Bright Data we are both data controllers and processors; I think that that's really a GDPR as being both. I think GDPR, CCPA, and the other regulations are a good thing. Finally, a few years ago, someone came in and said, hey, this is what you can do. This is what you cannot do. Great. All you need to do now is to comply. It's actually not that hard. Yeah, it was hard. The heavy lifting was a bit complicated just because you need to learn a bunch of new terms and adjust. Afterward, it's easy. It's better to be under a specific, organized regulation, not in a gray area, because you know that you're on the good side. In web 3.0, I think a ton of this is pretty much going to be forgotten, but I can't really blame anyone, because the whole point of the metaverse is to be visual about everything. That's the major difference. In web 2.0, I actually use tools to hide. I don't accept the cookies and the disclaimer in many cases. And sometimes, I use browser extensions that make sure that no one will track me. Fine, that's my decision. And the other side must comply with it. That's the law. In the metaverse, I don't think that I will join a universe as a ghost, just as a viewer. I don't think that there would be a possibility of being a ghost in a universe. So by definition, I will need to be out there with my visualized avatar. Maybe I can come up as a different person with a different identity, which is fine. Maybe this will be my way to stop others from tracking me, but I won't be able to be there without anyone seeing me. That's a major difference. I can visit any website today completely anonymously. I don't think that this will be applicable in any universe in the metaverse. For sure not anonymous; maybe under disguise. So maybe this quote-unquote disguise will be our way to keep our identity private in this new world.

[00:18:05] KRISTINA: Keeping identities private is really interesting, especially in the web 2.0 world. Considering the fact that you're looking for data. In some instances, there's no need to alter the data. For example, I don't know. Let's say that I want to buy a pair of Adidas shoes. You know, that's the price. You might see a slightly different price in different countries because of taxes or whatever, but for the most part, not very sensitive data, not something that companies would want to mutate or hide or ghost, if you will. But we do have other types of public data that is generated through bots, which are almost like ghosts. Are you differentiating right now the data that you're collecting around, whether it's bot-generated versus human-generated. And do you think that that's going to be something that changes as we head into the metaverse?

[00:18:51] OR: Okay. So, first of all, I have to disagree with the first assumption. Actually, what we see with organic content, in the actual example that you gave, a shoe brand, but it's true for almost every other brand, is that they will show different content to different users coming from different countries. Price was the example, but not just price, and significantly different prices. And they invest a lot of their own resources in doing this segmentation. If I were to share in the chat right now a product page of a random e-commerce website, you and I would see completely different values even if we clicked on the link at the same time and shared our screens. It's not just the price. It's obviously the shipping times, because I'm here and you're there, but it's also different reviews that we'll see, maybe different images of the same product, being on the same webpage at the same time. The power of these companies to segment the information that any user will see is unbelievable. Now, in many cases, it's not a bad thing; I prefer to see content that is specified for me because I will decide faster. I don't have much time. I want it to be relevant. But in many cases, this actually creates some type of discrimination. Why should I pay more than you, or the other way around? So we see a lot of that in almost every industry: e-commerce, travel, social media, and the news that we're consuming, almost everywhere. It's pretty unbelievable; I mean, it's everywhere, all the time. To your question: some of our customers are actually using our services, as we are at the infrastructure level. So we help our customers extract relevant data. We don't extract it. We have big, major companies using our services to extract the data in order to try and understand if it's real or fake. So we definitely see that. And there's a growing amount of fake data on the web, and also a growing amount of data that is being extracted to try to distinguish between what's real and what's not. We see our crawlers running on social media and websites, and even on e-commerce websites, to feed this information to some of our customers to try to understand what's real and what's not.

[00:21:41] KRISTINA: In that context, how familiar are IT departments with the technology, the data processing? I know that you looked at this from your survey perspective with regards to the immersive technologies, but how fluent are IT departments, or even other parts of the enterprise, in understanding the data, the context, what can be collected, and how things need to be analyzed and deciphered? I know you are at the infrastructure level, but what sense do you get around the skillset and familiarity with the technologies?

[00:22:14] OR: Yeah, so they are mostly more familiar than ever. And I would say that the level of penetration of this understanding in many organizations is already pretty high. We know that because once they realize that they want to have a data science team to analyze data, then they get to the point of, okay, now we need data, and then they come to us. And we see it, you know, by looking at our growth as a company. We're a pretty large company already. And there is the number of customers that are joining on a daily basis, and where they're coming from. This is already a very well-educated market, and I can see the difference. I mean, five years ago, in some cases we had to explain, even to the right person, to the IT person or to the data team, why they need web data. That's not the case anymore, for, I would say, two or three years. And they consume more data than ever, and they know how to get it. They know that they need it. It's more a matter of, you know, price: how much it will cost them, where they can get it from, what the quality of the data is, whether they can trust the data. It's less "do I need data?" Now we went a few levels up, to "that's the data I need. It must be qualified. I need it fresh." I mean, it's 5,000 an hour. So that's a whole different level of discussion, which suggests that many companies are already there. And I'm not talking just about companies that were born into the digital era. I'm talking about banks, very old banks, or insurance companies that have been around for a hundred years, and they did this shift to the digital world; they realized that they need data just to be better, to be more competitive, to know more. So I feel that we're already here.

[00:24:15] KRISTINA: That's reassuring. It's reassuring to understand that, you know, there's been not just a transformation within organizations to digital, but really there's been a transformation in terms of people and skillsets into the type of skills we need for what's coming, the metaverse. For organizations that are looking at the headlines, I hope they'll take away from your commentary: don't panic. We're not in the metaverse yet, but what should they be thinking about? That would be the critical part. For those that aren't missing out on anything yet, but see maybe, you know, competitors jumping onto the NFT bandwagon or maybe considering blockchain, what should they be doing right now? What are the two or three things that every CEO or every executive team needs to have top of mind as they think about heading into the metaverse? It sounds like from your survey, we're in pretty good shape in terms of skillset. What else should we be thinking about? What should we be worried about? What should we invest in?

[00:25:14] OR: First of all, just stay on top of it. You know, just read, understand, and hear what's going on. But to be more specific: as companies, we have to prioritize what problem we're going to focus on solving. Now, each company has its own thing; for an e-commerce brand, you know, it's just "how do I sell more of my products?" Every quarter or year, you have different problems you need to focus on. Until the solution to the problem that you want to focus on comes from the metaverse, I don't really think that you should actually do much to be there. You shouldn't come in too late. You need to look ahead. I already know, being a data company, that I'm going to be first when my customers need public metaverse data. So that's a problem I need to solve. So I'm already starting to experiment with this. But if that's not a problem that you need to solve in the next 12 months, just stay on top of it. Read the news, give me a call, but that's about it.

[00:26:27] KRISTINA: I liked your emphasis also on experimentation because a lot of organizations, I think, feel like it's all or nothing. And it sounds like your proposition is no, it doesn't have to be all or nothing. There can be an in-between; you can be experimenting, but you don't have to actually take the dive. In experimentation, awareness is really what's going to get you ahead as we enter into the metaverse because you'll have your pulse on the market.

[00:26:51] OR: Yeah, there's actually a risk to doing it too soon. You need to have a product-market fit in everything that you're doing. If you're trying something related to the metaverse and it's not working, don't force it; you'll lose. I mean, it's probably not the time, or not the right product, or not the right audience, or you just came in too early, or whatever. So, you know, experiment, I agree. Just make sure it's a limited experiment. You don't want to do a two-year experiment and then lose everything.

[00:27:23] KRISTINA: Or, I want to go back to the survey briefly. I want to pulse you a little bit on what the surprising data points or insights are that you obtained from the survey. Was it pretty much everything that you thought it was going to be? Or what are the things that were just really surprising to you?

[00:27:40] OR: What surprised me is that almost 90% of management, so C-level and VPs across IT, telecom, and other tech sectors, are planning to look at data intelligence solutions to be deployed inside the virtual worlds in the next two years. So this was more specific than I thought. I expected to get lower numbers of ventures that actually invest some thinking in it, but almost nine out of 10 said, oh yeah, we're there. We're at least thinking about it. That was higher than I expected. Some of it is just due to the nice headlines in the news that we talked about, but I felt it was genuine, and they're actually, you know, trying to understand, trying to learn, trying to monitor this area, and then thinking about it specifically from the data perspective.

[00:28:42] KRISTINA: No, those are really great insights and a lot of food for thought, I think, from that perspective, and also for anybody who's in the digital policy space to be thinking about how to capitalize on the risks and the opportunities that we're facing from the metaverse and also from business operations. So that's a really great insight. I'm actually personally hoping, Or, that the next time we speak, I'll actually have my headset on, or maybe a headset that's not as heavy and bulky as the one that I have at the moment. But in terms of the next things that you hope for, in closing, why don't you give us what you hope that we'll see or do or be in the next 18 months from your personal perspective, CEO, data wiz: what are you seeing or hoping for?

[00:29:25] OR: Yeah, I actually hope that we'll just get one real, valuable application that is somehow specifically related to the metaverse, because I actually share the same concern as these managers and C-levels that we talked to. I want to be swept away by some killer app that will be there, that I will use even on a personal level, not even as a business, just to make sure that this is real, that this is going to happen, because a lot of capital is being deployed into these areas today. People are inventing various things; some of them, on paper, sound amazing. I want to be swept away by this killer app that everyone will use and understand: okay, it started. The metaverse is here. So I hope to see it in the next year.

[00:30:25] KRISTINA: So that's a wonderful challenge to leave all of our listeners with: take on the challenge, take on the pursuit, make Or proud. Let's develop that killer app. Thanks so much for being with us today and for sharing your insights and the trends from the survey; really insightful and very much appreciated.

[00:30:42] OUTRO: Thank you for joining the Power of Digital Policy; to sign up for our newsletter, get access to policy checklists, detailed information on policies, and other helpful resources, head over to the power of digital policy.com. If you get a moment, please leave a review on iTunes to help your digital colleagues find out about the podcast.
