Host David Tan brings back some all-time great play calls and gives a glimpse behind the scenes of the AI that was used at the US Open Tennis tournament and how it could help an MSP.
The hyperbole is done. Now we can finally play the game. Look at that. Marvel at this one man. Goodbye, hello Heisman. Whoa Nellie. Let me tell you about Keith Jackson. For those of you that don't know, that is the voice of the late, great Keith Jackson, in my mind one of the greatest to ever do it, one of the greatest play-by-play commentators of all time. He's the voice of college football, many may say, famous for sayings such as "whoa Nellie" and "take him to the woodshed," and that call, the 1991 call of Desmond Howard's punt return for a touchdown against Ohio State that potentially clinched him the Heisman Trophy. And I am sure you're wondering why we're starting off today talking about, or with a clip of, Keith Jackson. I'll get to that in just a minute, but first I want to say hello and welcome to the Crush Bank AI for MSPs podcast, the podcast where we talk about all things artificial intelligence and machine learning and how they relate to managed service providers. My name is David Tan and I appreciate you joining in. I'm real happy to be here with you. Hopefully this is going to be a fun episode. We're going to talk about a few things that are very near and dear to my heart, not the least of which is sports, in particular college sports, pro sports of all types, and play-by-play commentary. And you're probably wondering, like I said, why we started off with that clip and why we're talking about play-by-play announcers. I'm going to get to that, but I'm going to take a step back first and start by talking about something I did this past week: I went to the US Open tennis tournament out here in Flushing, Queens, not too far from my hometown, the site of one of the four major tennis Grand Slam tournaments. And I had the privilege, along with a couple of my partners here at Crush Bank, of being the guest of IBM for the tournament. You may be aware that IBM is a huge sponsor of the US Open and has been for 20-plus years. 
IBM is a big sponsor in tennis and golf particularly. They invest really heavily in tournaments like the Masters and, like I said, the US Open, and really do a great job of not only sponsoring the tournament but making the data and the analytics and the statistics approachable and understandable for both the casual and the serious sports fan, through the broadcasts, through the announcers, through things like the app that goes along with it. So they just really do a great job, like I said, of making that data available and accessible, and that was a lot of what we spent time talking about. I was super fascinated. Obviously, I love the tennis, I love the atmosphere. If you've never been to the Open, I highly, highly recommend it. It's a great way to spend a day. We actually went for a night session, but it's a fun way to spend the night as well. Tons of things to do. I love going early in the tournament because you can go out to the outer courts and see some great players playing tennis not more than 10 or 15 feet away from you, and there's always something going on, and they've turned it into a great spectacle over the years. Like I said, highly recommend it. But back to the reason we were there. Like I said, IBM being a big sponsor, they have a very big presence there, and IBM really drives all the data associated with the tournament. So when you're watching the event on TV, you get statistics. You know, the basic statistics, things like first-serve percentages and winners and unforced errors and all the things that you would understand, but also the analytics, like the percentage of points won when a first serve is above 120 miles an hour, things like the likelihood of victory based on the conditions of the environment and the opponent and the strengths and momentum, and things like that, all driven by the data that's collected. And I was fascinated. We got a little bit of a behind-the-scenes tour. 
I was fascinated to hear that IBM collects approximately 50 data points per shot. So just think about a typical match in the tournament. I'll use a men's match as an example. If you're not familiar with tennis, men play best three out of five sets, a set goes to six games, and each game is essentially to four points. I'm oversimplifying it. Point being, there are hundreds and hundreds of points in each and every individual match, and the tournament's made up of 128 players on the men's side and 128 players on the women's side, so that means a total of 254 singles matches, 127 in each draw, and that doesn't count doubles and the like. So just the raw volume of data there is really inconceivable when you consider that every single shot in the tournament generates 50 points of data. The things that IBM can do with that are really pretty incredible. And if you download the US Open app, you can see real-time data, real-time statistics, real-time analysis, all driven by AI, and that's obviously why we're talking about it today. Those of you that know me or that know Crush Bank know what a great partner IBM is for us. We have been working with them tightly now for six or seven years. They really help us drive a lot of the things we do. We love their investments in technology, we love their platform, and, like I said, it was exciting to see it up front and in use in real time. But what I thought was incredible was the multitude of ways they were using AI, particularly generative AI, which is going to bring us back on point in just a couple minutes here. Really, it was fascinating where you can see, like I said, in real time, the data and the predictions and the predictive analytics change as conditions change, right? So as points are won, as games are won, as momentum, things that are more of a nebulous theory, those things take over. 
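To put the scale of that in perspective, here's a quick back-of-envelope calculation. The 50 data points per shot and the 128-player draws come from the discussion above; the average rally length and points per match are my own rough assumptions for illustration, not IBM's figures:

```python
# Back-of-envelope estimate of the tournament's shot-level data volume.
# 50 data points per shot and 128-player draws are from the episode;
# shots-per-point and points-per-match are assumed rough averages.
DATA_POINTS_PER_SHOT = 50
SHOTS_PER_POINT = 5        # assumed average rally length
POINTS_PER_MATCH = 250     # assumed rough figure for a long match

draw_size = 128
matches_per_draw = draw_size - 1       # single elimination: 127 matches
total_matches = 2 * matches_per_draw   # men's + women's singles = 254

shots = total_matches * POINTS_PER_MATCH * SHOTS_PER_POINT
data_points = shots * DATA_POINTS_PER_SHOT
print(f"{total_matches} matches, {data_points:,} shot-level data points")
```

Even with those conservative assumptions, singles play alone would generate well over 15 million data points across the tournament.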
You can see in real time how the data changes and, again, they're all AI-driven analytics and they're all IBM Watson. But there's something IBM has started doing recently. They started, I believe, with the Masters back in April of this year, and I thought it was a very interesting proof of concept. I think they've honed it a lot. You know, we're now in the September timeframe. The tournament starts late August, early September, so five months or so since the Masters, and a lot has changed in IBM's capabilities around generative AI. They've obviously announced watsonx and launched their entire platform, and I think they learned some lessons from the Masters and from some feedback they got there, and I think they've turned this into a really slick solution, again based on all of their technology. What I'm talking about is, if you go into the US Open app, you get a series of highlights, hundreds and hundreds of highlights, right? Every point, every set, any exciting shots, serves, returns, you name it. There are hundreds of highlights in there for every day and for the entire tournament, and they don't have the people that do the broadcast do the play-by-play analysis of each of those shots and record them. What they do is they basically use the watsonx platform and generative AI to create that play-by-play commentary. So in other words, if you pull up a point between, say, Novak Djokovic and Daniil Medvedev, and they didn't actually even play each other in this tournament, they were just the first two names that came to mind, you will hear an analysis that's essentially a computer-generated voice that created that play-by-play. So it's not someone recording it, it's created through generative AI, and I found it kind of fascinating, and it also made me a little bit sad for a few minutes while I thought about it. 
So I'm going to get back to the technology in just a minute, but that's kind of why I started this podcast with a very quick clip of Keith Jackson. That play just so happens, to give you a little bit of insight into me, to have taken place in 1991. I happened to be a student at the University of Michigan at the time, and it was one of the highlights of that season. Obviously, Desmond Howard won the Heisman Trophy. Michigan had a good team. They probably lost a bowl game, if I had to bet; I don't remember off the top of my head. But you hear Keith Jackson, you hear some of these great commentators, whether it's a Vin Scully, Phil Rizzuto, Howard Cosell, you name it, and you know it's a big event. Right? When Jim Nantz comes on at the beginning of every Masters and says, "Hello, friends. Jim Nantz here in Butler Cabin, where later today the green jacket will be presented," you know. When Brent Musburger used to say "you are looking live," you knew it was time for the Rose Bowl. And I've always had those associations. You know, I grew up without a TV in my room when I was a kid, so we used to listen to Phil Rizzuto do play-by-play of Yankee games, and it was incredible. It was an entirely different experience than watching the game on TV, because these announcers had to paint the picture for you. And Rizzuto was great in so many ways, but he would also just go off for hours on end about things completely unrelated to the game. He got completely lost in the moment, in the conversation, and it was fascinating. And seeing what Watson was doing and listening to it kind of made me, like I said, a little bit sad. Is this the future of, you know, play-by-play sports broadcasting? If you think about some of the things happening in that industry right now, I'm sure any sports fan out there is probably familiar with things like the layoffs at ESPN, and there are incredible amounts of money being paid for television contracts. So certainly, live sports is alive and well. 
The NFL package goes for billions of dollars, CBS pays multiple billions of dollars for March Madness, but it's become a little bit of an unsustainable financial model. So you start to worry, or you wonder: are these networks going to turn to computers, quite frankly, artificial intelligence, to start doing this play-by-play broadcasting? Are we going to lose some of the great calls of all time in the future? Are they going to become sort of a relic of the past, like so many other things seem to? I certainly, personally, hope that's not the case, for a multitude of reasons. But then it also got me wondering. Right? I've seen and I've read about artificial intelligence that can mimic a voice with only three seconds of audio. Quite frankly, I think that's even worse. The last thing I would want is to train a model on what Keith Jackson's voice sounds like, train it on how a football game works, and then let it do play-by-play broadcasts as Keith Jackson 20 years after he has passed away. That would be an abomination, quite frankly. I truly hope that's not the case. But again, I also hope it's not the case that we start to lose some of these great moments and they just become computer-generated. I don't think that'll be the case, but I was just sort of fascinated thinking about that. But really, what I'm equally fascinated about is just the technology and what goes into this, so I want to spend a couple of minutes talking about that. I thought that might be fun and interesting. I often do sessions and lectures and presentations around artificial intelligence, particularly when I'm talking to managed service providers. Obviously, we talk about what we do around everything from semantic search to data and knowledge management and operational automation, around things like predictive analytics and text analytics, things like that. 
There are some really cool things in the abstract that artificial intelligence can do, but it doesn't become real until you can start to see it in practice and in production, in real-world use cases. This was, again, such a fascinating one for me. Being such a computer nerd on one side but also a fanatical sports fan on the other, to see those two come together was pretty cool. If you take a step back for a minute and think about what went into this type of technology, this type of solution, what IBM had to do, it's really a multi-tiered approach to this technology and to this creative output. That's the other thing that's sort of fascinating, because we tend to think about these things in a vacuum. Let's just talk about one piece that I'm going to go a little bit deeper on, which is what we call visual recognition. Most people understand what visual recognition is, but I can oversimplify it by saying that if you can train a computer on what a cat looks like, what a dog looks like, what a giraffe looks like, and you then show it a bunch of pictures, it can identify dog, cat, giraffe, and so on and so forth. That's fairly fundamental. That visual recognition is the underlying technology for everything from good technology, like autonomous driving, having the cars recognize what's on the road and being able to adapt and see in real time, to some technology that's not necessarily as good, like facial recognition, being able to identify someone in a crowd, for good or for bad. Hopefully it's used more often for good, but certainly there's a possibility of invasion of privacy and things like that. That's the foundation of this. The first thing that you need to do to build this type of a solution is visual recognition. You need to train a model, a computer model, that can watch a tennis match and identify what's happening. Is that a forehand? Is that a backhand? Is that a ground stroke? 
Is that an approach, a drop shot, a serve, a second serve, a fault, an ace? You get the idea, right? All of these things are picked up by the computer visually, so it watches the video and it converts that video into ones and zeros. Right? This is all data-driven, like anything else. And then those ones and zeros get put through what we call a generative AI model, and we're all familiar with generative AI at this point. Right? We understand it's about a computer that can create content. Hopefully you've listened to some of my previous podcasts or seen me speak about what generative AI is and how it works. But in this particular case, IBM had to do something interesting which, quite frankly, is fairly common in the generative AI space: they have these foundation models. The way IBM watsonx works is they have a bunch of foundation models which are the underpinnings of this technology, meaning these foundation models know how to speak English, in this case. Right? We'll just say they know language, they know how to talk, they know how to form sentences, paragraphs, things like that. But they don't know tennis. There's no open-source model around tennis lingo. So what IBM needs to do is take one of those foundation models and layer training on top of it, fine-tune it, if you will, with tennis lingo. So, again, the same terminology I'd use: when one player standing behind the baseline throws the ball up in the air and hits it into a box on the other side, that's called a serve, and IBM needs to train Watson that that is a serve and that's the term you need to use. And then if the other player doesn't return it, it's called an ace. If it doesn't go in the box, it's called a fault. You get it. I'm obviously not going to explain all the tennis lingo here, but they have to train this generative AI model on what tennis is. 
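One lightweight way to picture that domain adaptation, and this is a hypothetical sketch rather than IBM's actual watsonx tuning pipeline, is showing a general-purpose language model a handful of labeled tennis events alongside the commentary phrasing you want, then asking it to describe a new event. The event codes and the `build_prompt` helper here are invented for illustration:

```python
# Hypothetical sketch of teaching a general language model tennis lingo
# via a few-shot prompt. Event codes and examples are invented; the
# real system fine-tunes foundation models rather than prompting this way.
TENNIS_EXAMPLES = [
    ("SERVE speed=122mph result=unreturned",
     "A 122 mile-per-hour serve his opponent can't touch. Ace!"),
    ("SERVE result=out_of_box attempt=1",
     "The first serve sails long. Fault."),
    ("FOREHAND placement=down_the_line result=winner",
     "A blistering forehand down the line for a winner."),
]

def build_prompt(event: str) -> str:
    """Assemble a few-shot prompt that demonstrates tennis phrasing."""
    lines = ["Describe each tennis event as play-by-play commentary.", ""]
    for ev, commentary in TENNIS_EXAMPLES:
        lines += [f"Event: {ev}", f"Commentary: {commentary}", ""]
    lines += [f"Event: {event}", "Commentary:"]
    return "\n".join(lines)

prompt = build_prompt("SERVE speed=130mph result=unreturned")
# `prompt` would then be sent to a hosted foundation model for completion.
```

The same idea scales up: whether you prompt or fine-tune, you're layering domain knowledge, in this case what a serve, an ace, and a fault are, on top of a model that already knows language.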
So at this point they are taking visual recognition, turning that data into ones and zeros, putting it through a custom-tuned generative AI model and creating an output. And then that output has to run through another AI system, which in this case is speech-to-text, I'm sorry, text-to-speech, the other way around. So in other words, the machine is creating the text and then a voice is speaking it. In the app, if you listen to it, I believe there are a few choices; you can choose the voice that comes out the other side. Nothing famous, obviously, but think about it: everyone's familiar with the GPS in their car. You can make the voice sound just about any way you want, right? You can give it a British accent, you can give it a French accent, you can have it be a man or a woman. Same idea here. I mean, theoretically, like I said, looking into a bit of a dystopian future, we could have it sound like someone that we know, or that has passed, or that was an all-time great, things like that. But really, at this point we're combining, or IBM, I should say, is combining, four or five different AI technologies, and that's where this stuff becomes incredibly cool and incredibly fascinating: when you can see it in reality, but also when you can layer it together. So, just to kind of wrap up, or review, I should say, the things that we've talked about here: it's a visual recognition model which turns the video into data, which then gets fed into a generative AI model. That generative AI model is custom-trained on the domain, in this particular case tennis, and it creates the play-by-play commentary, which gets fed into a text-to-speech model, which then gets read out. These are all highlights in the app in this case, but this can all happen in real time as someone's watching the match. There's no reason why it couldn't just generate the commentary for you live. 
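That layered flow can be sketched as three composed stages. To be clear, every function body below is a toy stand-in I've invented so the shape of the pipeline is visible; the real system plugs in IBM's vision, language, and speech models at each stage:

```python
# Illustrative sketch of the multi-stage pipeline described above:
# video -> shot classification -> generated commentary -> audio.
# All three stage implementations are toy placeholders.
from dataclasses import dataclass

@dataclass
class ShotEvent:
    shot_type: str   # e.g. "serve", "forehand"
    outcome: str     # e.g. "ace", "winner", "fault"

def classify_shot(video_frames: list) -> ShotEvent:
    """Stage 1: visual recognition turns raw video into structured data.
    Toy stand-in: always reports a service ace."""
    return ShotEvent("serve", "ace")

def generate_commentary(event: ShotEvent) -> str:
    """Stage 2: a tennis-tuned generative model writes the call.
    Toy stand-in: fills a template instead of calling a model."""
    return f"A thunderous {event.shot_type}, and it's an {event.outcome}!"

def synthesize_speech(text: str, voice: str = "narrator-1") -> bytes:
    """Stage 3: text-to-speech reads the commentary aloud.
    Toy stand-in: returns the text as bytes instead of audio."""
    return text.encode("utf-8")

def commentary_pipeline(video_frames: list) -> bytes:
    event = classify_shot(video_frames)
    return synthesize_speech(generate_commentary(event))
```

The design point is the layering: each stage consumes the previous stage's output, so any one model (the voice, the language model, the classifier) can be swapped without touching the others.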
And what I thought was actually kind of cool about it was, when I took a step off my soapbox and stopped worrying about our computer overlords and a dystopian future, I started to think about the interesting possibilities of it. So, instantaneously, this can happen in hundreds of different languages, virtually any language a model can be trained on. So wherever you are, in whatever remote location, whatever language you speak, if you can get access to these systems, you can now get play-by-play in your native language. That's incredibly cool. As I say, I'm a tremendous sports fan, and you watch a sports broadcast and they'll often have an alternative broadcast in, say, Spanish or another language, but they certainly don't have it in hundreds and hundreds of languages. So I think just the approachability and the availability of that for people of all different backgrounds and ethnicities, from different countries, speaking different languages, makes it incredibly cool. There's also accessibility, right? You can make the commentary easier to follow for people with difficulty hearing or different levels of understanding. Let's say we can fine-tune that output for different people. So, really, what you're doing in a lot of cases is leveraging this AI to make this technology, or make this entertainment, I should say, more available to more people, and I think that's incredibly cool. Right? So, yeah, we may run the risk of ESPN deciding they don't need play-by-play commentators anymore. I don't think that's going to happen anytime soon. I certainly hope it doesn't happen because, like I said, I have so many great memories of what play-by-play sounds like just listening to a TV or radio broadcast. But, on the opposite side of the coin, being able to make this entertainment so much more available, so much more accessible, is incredibly powerful. 
I mean, just think that you can open up the ability to watch the US Open to people of all walks of life all over the world. I think that's incredible. Right? It just increases the size of the audience. And tennis is just the threshold, right? That's just the forefront. IBM's doing some incredible things. This is going to be happening in all sorts of different sports and all sorts of different events, I have no doubt about it. Personally, I don't mind if we create millions and millions more Michigan football fans throughout the world. I certainly would be in support of that. So I think that's kind of cool. I think it's really interesting to think about the ways technology, again, in this case specifically generative AI, which is so powerful, the ways it can start to impact our lives and what it can do for people. And again, I think it's interesting to think of this layered together. Right? We often look at these solutions in a vacuum. What can you do with just generative AI? What can I do with visual recognition? What can I do with text-to-speech, or even speech-to-text, vice versa? Really, it becomes much more fascinating when you can start to layer these things together. I think the more you can dig into real-world examples of what this is and how it works, the more it helps. For me personally, it helps my mind start thinking about other interesting examples. Real quickly, as I kind of wrap up here, just think about this as an MSP. We all want to, not even cut down on staff, we all want to optimize our staff. We want to make them more efficient. We want to make them more effective. We also have trouble hiring people, finding the right skills, bringing new people in, and retaining people. Well, what if there was some sort of technology that was the next step beyond an IVR phone system? 
Right? It can answer the phone; you can have a conversation with someone, or make it seem like you're having a conversation, when it's really a computer on the other side. It can take a ticket from a customer, run that ticket through some sort of search or retrieval, find the answer to the question, then run it back through generative AI and have it speak the answer. So a user calls in, gets a computer voice on the other side, asks for the IP address, sorry, the address of a VPN, for example: hey, I need to connect to my VPN, what's the address? It can retrieve it, it can speak it and, before you know it, that question is answered, that problem is solved, and no human interaction was involved. I'm obviously oversimplifying it, but it really lets your mind think about what some of the possibilities are in all industries. I'm clearly laser-focused on managed services and IT service providers in general, but this applies to all sorts of industries. So it's super fascinating, and I'm very excited we're at the forefront of a lot of this stuff, just to see what's going to happen over the next three to five years as this technology starts to really evolve, get sophisticated, get more accurate, get more understanding and, quite frankly, as more people start to train it. I always say it's all about the data, it's all about the training, and we need experts to train it. In this case, IBM found a bunch of tennis experts to train a model on what the voice of a tennis match sounds like, and it was able to turn that into a product. The more we can get people training AI systems on different domains of data, the more capable and the more powerful these systems become, and the possibilities are limitless. So I know I'm, for one, excited to see what's to come, and even if that means I occasionally have to sacrifice another great play-by-play call, at least we know the technology is evolving and the human race is going to continue to benefit from what's being done. 
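That MSP call flow is the same layering idea as the tennis pipeline, just with different stages: speech-to-text, retrieval, then text-to-speech. Here's a deliberately tiny sketch of the shape of it. The two-entry knowledge base, the `vpn.example.com` address, and all the stand-in function bodies are invented for illustration, not any real product:

```python
# Hypothetical sketch of the AI helpdesk flow described above:
# caller audio -> transcribed question -> retrieval -> spoken answer.
# The knowledge base and every stage implementation are placeholders.
KNOWLEDGE_BASE = {
    "vpn address": "Your VPN gateway is vpn.example.com.",
    "password reset": "Use the self-service portal to reset your password.",
}

def speech_to_text(audio: bytes) -> str:
    """Stage 1: transcribe the caller. Toy stand-in decodes bytes."""
    return audio.decode("utf-8")

def retrieve_answer(question: str) -> str:
    """Stage 2: naive keyword retrieval against the knowledge base."""
    q = question.lower()
    for keywords, answer in KNOWLEDGE_BASE.items():
        if all(word in q for word in keywords.split()):
            return answer
    return "Let me route you to a technician."

def text_to_speech(text: str) -> bytes:
    """Stage 3: synthesize the reply. Toy stand-in returns raw bytes."""
    return text.encode("utf-8")

def handle_call(audio: bytes) -> bytes:
    return text_to_speech(retrieve_answer(speech_to_text(audio)))

reply = handle_call(b"Hey, I need the VPN address to connect")
```

A real deployment would swap in proper speech recognition, semantic search over the MSP's documentation, and a generative model to phrase the answer, but the wiring between stages looks just like this, and the fallback to a human technician is the part you'd keep.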
I want to thank you again for listening to the AI for MSPs podcast from Crush Bank. As always, I'm David Tan. Please reach out if you have any questions, comments, or things you'd love to hear me talk about. We'd love to hear some feedback. You can reach me anytime at david@crushbank.com. Thank you so much for tuning in, and I'll speak to you again next time.