008-01 – UC Berkeley Cloud Computing Meetup 008 (October 29, 2019)

– (House Speaker) So, our first speaker today is Konstantin Savenkov. He is the CEO and co-founder of Intento, Inc. In this talk, he will share insights into the Cloud AI landscape. He will talk about the challenges large enterprises face in procuring and implementing hybrid AI solutions, and how the Intento AI platform helps some of the biggest companies solve them. So, welcome, Konstantin.
– I have some issues with my voice today, so I'll be good for 15 minutes. (audience laughing) We are one of the SkyDeck companies, part of the current funding cohort. We help companies to procure and deploy AI. So, we help them find which vendors are right for them and put them into production. I'll start by talking about the AI landscape, and then, by the time (mumbles).
So, the Artificial Intelligence stack is pretty deep today. All these advances are possible only because we can build on the shoulders of others, right? However, you need to master it that deep only in some specific cases, when you build core AI technology or something you can only do in-house, using some framework on your data, things like (mumbles). In the vast majority of cases related to cognitive services, no one builds in-house, well, except maybe Amazon, Google, Facebook, just because you need all the data in the world for that. You need substantial investments, both money and time; you're going to spend lots of time working on it. So mostly, what happens is companies get some baseline models from someone, and then improve the chosen model based on their own data. There are plenty of systems for that.
These simple things you can do with data, we call them intents. That's why (mumbles). They're not intents in the chatbot sense; it's different types of intents, like speech transcription or translation. And there are more than 200 vendors, actually much more than that, many thousands of models. And you can discover them in different places. There are API marketplaces, like RapidAPI, but there are problems over there. There are the major AI vendors, such as Google, Amazon, Microsoft and others, but that also rules out a lot of independent vendors, which excel at curating data for some specific niche problem and really monetize this data. For example, some time ago a small German company called DeepL took about the same dataset Google has, based on Common Crawl data, and they hand-corrected it for some languages, like German, French, Spanish, and they beat Google Translate for those languages. They decided to focus on that niche.
So, overall, we identified, I think, more than 400 cognitive intents, and I'll show you the numbers. We classify them by the type of content we may process, like (mumbles), and by what to do with this content: information extraction, classification, transformation, or something else. For example, in text... what's first? Yeah, small. So, great, very small font. So, examples in text: extraction is entity extraction; classification, for example sentiment analysis, when you classify text by sentiment; transformations, like translation. And there's a similar taxonomy, I mean, we have that for image analysis and for audio and video. For example, for text processing there are around 23 different intents, such as keyword extraction, intent extraction, linguistic analysis, content generation, translation, and summarization. And for every one of them there are, on average, I think, from 15 to 20 different vendors which provide pre-trained models.
Then, what may we find there? First is what type of models are provided. The most generic is a pre-trained model, which works well on general-purpose data. For example, Google Translate offers a stock generic translation model; it's not tuned to particular niches, just one size fits all. The next level is pre-trained models for some specific niches. For example, in Microsoft Translator you can get pre-trained models for business, IT, or conversational translations, if you know how to find them; it's not always possible. And, for example, you may also find pre-trained models for different medical cases: Amazon Comprehend, the general-purpose model, and Amazon Comprehend Medical for (mumbles).
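(Editor's note: a minimal sketch of the generic-versus-niche split using the two AWS services named in the talk, Amazon Comprehend and Amazon Comprehend Medical; the sample text and printed fields are illustrative, and the calls require AWS credentials.)

```python
import boto3

text = "Patient was prescribed 20 mg of Lipitor."

# General-purpose pre-trained model for entity extraction:
generic = boto3.client("comprehend").detect_entities(
    Text=text, LanguageCode="en")

# Niche pre-trained model for the same intent, tuned for medical text:
medical = boto3.client("comprehendmedical").detect_entities_v2(Text=text)

print([e["Type"] for e in generic["Entities"]])   # generic types, e.g. QUANTITY
print([e["Type"] for e in medical["Entities"]])   # domain types, e.g. DOSAGE
```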
Then, sort of, the hot topic is customization, where you can take this baseline model and use your own data, typically about three orders of magnitude less than training from scratch, to adapt this model to your specific problem. It's not really training, I think; it's more like giving a cloth to a dog to sniff, that's what you're actually doing. There are different approaches to that. One approach is data augmentation: you upload your data to the vendor, and in their stock corpus they find similar data. So they augment your data with their data, and there is a substantially large dataset to train this particular model. The second approach is transfer learning, and typically it's employed by large companies such as Google, because it's expensive.
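(Editor's note: a minimal sketch of the transfer-learning idea under stated assumptions: vendors hide these details behind an API, and the model and data here are stand-ins.)

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained baseline: a frozen encoder plus a small
# task head that we re-train on a little in-domain data.
encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU())
head = nn.Linear(256, 8)

for p in encoder.parameters():
    p.requires_grad = False  # keep the baseline knowledge intact

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# x, y stand in for your small in-domain dataset, roughly three
# orders of magnitude less than training from scratch would need.
x, y = torch.randn(64, 512), torch.randint(0, 8, (64,))
for _ in range(10):
    opt.zero_grad()
    loss = loss_fn(head(encoder(x)), y)
    loss.backward()
    opt.step()
```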
And then, what's quite a hot topic for these services is incremental adaptation. You're not training on a chunk of data; instead, when the model produces some result, you get feedback on this result from a human, and you can push the feedback back, just one data point, to improve the model. So that works if you have some human-in-the-loop scenario. That's popular in machine translation: a person edits the text and pushes it back to the model, and the model incorporates the corrected sentence right away.
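(Editor's note: a hypothetical sketch of that human-in-the-loop feedback flow; the endpoint and field names are invented for illustration, not a real vendor API.)

```python
import requests

API = "https://api.example-mt-vendor.com/v1"  # placeholder URL

# 1. The model produces a result.
resp = requests.post(f"{API}/translate",
                     json={"text": "Hello world", "to": "de"})
draft = resp.json()["translation"]

# 2. A human editor corrects it.
corrected = "Hallo Welt"

# 3. Push the single corrected data point back to adapt the model.
requests.post(f"{API}/feedback",
              json={"source": "Hello world",
                    "machine_output": draft,
                    "human_correction": corrected})
```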
In terms of... oh, I think that is a pretty large topic, I'll skip that. So basically, what I want to say here is, yeah, the quality is very fragmented. Those large vendors don't tell you what data they used to train the models, right? And the performance of these models for your case depends on how similar your data is to their data, and you have no idea how to assess that. So the only way here is to try, and when you do, you see that this space is super fragmented. We have a lot of focus on translation, that's why I'm drawing examples from translation; we'll look at that, I'll show you, it's very fragmented.
Then, if you do this domain adaptation on (mumbles) data, you'll see that the learning curves are different. So it's not only the initial difference: you build on top of this baseline model, and maybe one model starts high but improves slowly with more data, while another model starts lower but learns faster, so it flattens out later. So when you train, you use some knowledge about your data. One way to do that is to invest in human cleaning of this data. Another way: some of the AI vendors provide services for automated cleaning that is involved in training. So, depending on the data you have, some vendor will work better for you because they clean it automatically.
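(Editor's note: a toy illustration of the two learning-curve shapes described; all numbers are invented.)

```python
import math

def quality_a(n):  # starts high, improves slowly with more data
    return 0.80 + 0.02 * math.log10(n)

def quality_b(n):  # starts lower, learns faster
    return 0.60 + 0.08 * math.log10(n)

for n in (100, 1_000, 10_000):
    print(n, round(quality_a(n), 2), round(quality_b(n), 2))
# Model B overtakes model A somewhere past ~2,000 examples, so the
# right vendor depends on how much (cleaned) data you have.
```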
In order to figure out which model is right for a specific case, there are different approaches. The two main ones are reference-based evaluation, where you have some golden-standard data and you just compare the model output to that reference. It has lots of problems, including that in many cases, for example in a translation task, there is no single point of reference: you may have five different translations, each of them perfect. The other approach is human evaluation, where you just give the output to humans and they rate it, and the problem with that is that it's too expensive. At our company, we found a way to combine them: do the legwork using algorithms and then give humans a small sample to analyze (mumbles).
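(Editor's note: a sketch of reference-based evaluation using BLEU, one common automatic metric for translation; sacrebleu is one library choice, not necessarily what Intento uses.)

```python
import sacrebleu

hypotheses = ["The cat sits on the mat."]          # model output
references = [["The cat is sitting on the mat."]]  # golden-standard data

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)  # higher means closer to the reference

# The talk's caveat applies: several different translations can all be
# perfect, so scoring against a single reference under-rates them.
```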
So, for example, in machine translation there are around 30 vendors in the cloud, and beyond the cloud it's much more than that, which work with more than 14,000 language pairs, translating text from one language to another. And if you take the most popular language pairs, you see that in order to get the best quality, you have to combine around nine different vendors, depending on the language. I purposely removed brand names from here; you may find them in our public (mumbles). And over the last year, it changed. Not just the last year: over the last half year the same happened, the leadership changed for about half of the language pairs. (mumbles) There are domain updates.
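(Editor's note: a minimal sketch of the combine-vendors idea, routing each language pair to whichever vendor scored best on your data; vendor names and pairs are invented.)

```python
# Routing table built from your own evaluation results.
BEST_VENDOR = {
    ("en", "de"): "vendor_a",
    ("en", "ja"): "vendor_b",
    ("fr", "en"): "vendor_c",
}

def pick_vendor(src: str, tgt: str) -> str:
    return BEST_VENDOR.get((src, tgt), "default_vendor")

print(pick_vendor("en", "de"))  # vendor_a
```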
So, you have all those different translation models, and when you look at your data, the best one depends on the subsets of your data, because different vendors improve differently. So, that's our Cloud AI landscape. The problem with the cloud is that companies fail a lot at adopting AI, even when this AI is available in the cloud. Because, oh yeah, because for them it's super hard to procure: you have to try, like, 20 different vendors on your data before you decide which one to work with, and that's not how enterprises procure software. Enterprises typically will just select one vendor, and only after that try it on their data.
And second, it's hard to deploy. Enterprises use integration solutions with standard connectors, which work well for one-to-one situations, when you need to connect your ERP system to your IT service desk or something like that. But when you need to connect, like, five AI vendors with five internal systems, this quickly turns into a mess, like that. So, you have to build all those peer-to-peer integrations on top of your enterprise solutions.
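(Editor's note: the back-of-the-envelope arithmetic behind that mess. Point-to-point integration grows as N x M, while a hub grows as N + M, using the talk's five-and-five example.)

```python
vendors, systems = 5, 5
print("peer-to-peer connectors:", vendors * systems)  # 25
print("hub connectors:", vendors + systems)           # 10
```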
And that's actually how we solve all those problems. We provide a solution for, like, streamlined procurement of AI, where we provide a single interface to all those AI vendors, and we provide professional tools for data cleaning, model training, and model scoring. And we have partnerships with vendors that handle that. Also, we provide what we call the Enterprise AI Hub, which encapsulates all this complexity related to dealing with multiple AI systems and, because of that, provides a very simple, single API to all those systems. So, it can be easily plugged into all those enterprise integration systems.
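(Editor's note: a hypothetical illustration of the single-API idea behind such a hub; the vendor adapters here are stand-ins, not Intento's actual interface.)

```python
def translate_via_vendor_a(payload):   # stand-in vendor adapter
    return {"vendor": "a", "result": f"translated: {payload['text']}"}

def sentiment_via_vendor_b(payload):   # stand-in vendor adapter
    return {"vendor": "b", "result": "positive"}

ROUTES = {"translate": translate_via_vendor_a,
          "sentiment": sentiment_via_vendor_b}

def ai_hub(intent, payload):
    """One entry point: callers name an intent, the hub picks the vendor."""
    return ROUTES[intent](payload)

print(ai_hub("translate", {"text": "Hello"}))
```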
Yeah, and we work with companies like IKEA, (mumbles), of all sizes. I think that's it. Thanks. (audience clapping)
