This week, we had the honor to interview Jake George Schuster, the CEO of Gemini Sports Analytics, a new data science as a service company, which is making data analytics accessible.
Picture: Gemini Sports Analytics’ automated machine learning.
Picture: Gemini Sports Analytics’ project export.
📝Show Notes: Throughout this interview, we touched on Jake’s background, his company, the benefits for teams of using the software, and his vision of the sports science space. Then we talked about the NFT space and his plans for the next 12 months.
Best Quotes: Here are some of the key discussion points and best quotes from our conversation with Jake:
On his background:
- “The first thing that you should know about me and my background is that I am not a data scientist, nor do I play one on television. I’m a person who sees a problem and a solution in an industry and knows the right people who can bring it there. My background is strength and conditioning, and my PhD work is in biomechanics. I moved into sports science and technology relatively recently in my career. I was fortunate to work in professional rugby and Olympic sports, in the NCAA and overseas, for the better part of a decade. Then I burned out; I had traveled too much, I saw too many different sports and too many different experiences and I burned out and realized that I was ready to be my own boss”.
On how he ended up launching his startup Gemini Sports Analytics:
- “I had the great fortune of building an incredible network during my travels. And in my travels, I saw patterns, and what I saw was that people were really dissatisfied with the tools that they did or didn’t have in their hands to work with data. We all know how much data teams are collecting, and there’s a real frustration where teams want to do more with their data and it’s awesome that teams are hiring so many data scientists. It’s really moving things forward, but hiring more and more and more data scientists would be kind of like asking Henry Ford for a faster horse, right?”
- “There’s a huge gap that we notice, where there’s a lot of people working with data or being data driven, so to speak, but they don’t have any tools in their hands to work with that data. They need something that’s complementary to the data scientists and makes their lives easier, allowing non data scientists to work with data”.
- “We love data scientists, we work with data scientists, we have data scientists, but just adding more and more and more of them isn’t the answer. I mean, this concept has been co-founded by Jose Fernandez who had a couple dozen data scientists and developers working for him at the Astros and he saw this problem”.
On the issues with some current analytics platforms used by pro teams:
- “We see some companies selling black boxes and black boxes are just wrong. I mean, I think one of the most exciting things that we can do to move the industry forward is building transparency into our data models because that enables coaches and other stakeholders to understand how predictions are made and to understand which variables affect different factors and help to minimize risk or make better decisions. When companies are selling black boxes or concierge services, it’s almost like a personal trainer who wants their client to get fat if they skip a few sessions so that they need them. We think the opposite way: We want to put a tool into users’ hands so that they have it and that they love using it themselves”.
- “Julien, we might be thinking of the same story where there’s a group of NHL players that played for one specific team about 10 years ago when the first monitoring technology came into hockey and that technology was black box. And I understand why that one is, but it was a black box and it just kept telling them all, your injury risk is elevated. And so, they all got paranoid and they didn’t want to use wearables anymore because of that experience. And not everything can be open sourced and not everyone can stop every day and understand why something is happening”.
On his view on the evolution of machine learning:
- “Fortunately, machine learning has evolved a lot recently. It’s a young field, I think XGBoost data model is only seven years old, and determining causality is going to be a big challenge because ML techniques aren’t necessarily effective at examining that. The more data that teams can aggregate the better and while it’s not feasible for it to be our value proposition from day one, I do think that eventually, teams will want to be able to tap into aggregated data from many, many different groups and see if there’s a kind of snowball there that can be collaborated on”.
- “The movie Moneyball is a big inspiration for sure. If we use the baseball analogy with nine innings, Moneyball was the first inning, right? And I think the second inning has come over the last decade or so, and the work of the teams like the Astros and the Houston Rockets & Sixers with Daryl Morey. Shane Battier, another pioneer in this space said to us that he thinks right now we’re in the third inning, right? Where teams are doing math and they’re hiring the computer people but now, we need to put computers in the hands of people who are not computer people, so that they can be empowered to make decisions”.
On his product and how it benefits the teams, and how it’s different from competing products out there:
- “Our web application is a coding-free predictive analytics environment. What we mean by that is it’s a dedicated platform where a user who does not write computer code can go in and perform data science without relying on a data scientist and without having to distract the data scientist from their projects, and that product doesn’t yet exist in the sports field. It exists in generic automated machine learning products out there, but there’s nothing built specifically for sports. Actually, we’re building our platform on what we think is the best generic platform out there, DataRobot so our computing power and our machine learning power is built on something that’s trusted by Goldman Sachs, General Electric, and the US Navy. And it’s got the cloud power of Snowflake, which is the best in class out there for that functionality. There are existing platforms that are offering analytics services, but nothing that is a tool in the user’s hands”.
- “It’s not a matter of what we can predict, it’s a matter of what teams can predict using our tool with their data sets”.
- “A data model is only as good as what you feed it, as what you ingest, so there will be plenty of cases where teams don’t have the data that they need to answer certain questions. Fortunately, teams have a lot of questions, so there’s a lot of value to add anyways, including with publicly available data”.
- “We’re a dedicated analytics platform. We refuse to be a mile wide and an inch deep. Instead, we are going all in on analytics and that’s what we offer. We’re not dashboarding, we’re not data visualization, we are data analytics. And most importantly, we put a tool in your hands. You don’t rely on us for anything. There’s no third party. You don’t have to wait for us to run a model and get back to you. We’re putting the data models in your hands and you get to do it yourself”.
- “Everyone who’s a part of this company has been stuck with technology in their hands that was too difficult to use, and that they ended up putting away. We know how easy this has to be to use, we know how clean the interface has to be and are on a mission to build to those standards”.
On the use cases that they are currently focusing on with their platform:
- “With our first three customers, what we’re calling our private beta, we’re working with friends, getting a lot of feedback, they’re in the NFL, the NBA and European soccer, iterating ahead of our launch later this year. The use cases so far are load management, how to decide which players need to rest and what kind of rest they need midweek in between NFL games. In the NBA, it’s around line-up optimization so how do you match up against upcoming opponents. In soccer, we are helping with scouting for summer transfers, Talent ID”.
- “When you’ve got a 45,000 person scouting database, that’s a lot to ingest with one data scientist. So we’re making that available and then seeing what players fit their playing style and who could increase in value the most over upcoming years, because unless you are Manchester City or Bayern Munich, chances are that your business model as a team revolves around buying players at a low valuation and selling them at a higher valuation. That scouting question will apply to many sports, but we’re starting with it in soccer”.
On where they fit into the sports tech ecosystem:
- “We have many friends that work at or run AMS companies and we see ourselves as very complementary to them. Similar to the consumer packaged goods industry, where you have IMS or inventory management systems, and then you have analytics tools built on top of that, we want to be friends with every AMS company and build integrations with them, because the more organized a team is in terms of its databasing, the easier it will be for them to integrate with our platform or start using our platform. We’ll make data ingestion easier than ever, but if teams are already organized, that’s great. In terms of the ecosystem, look, this is a blue sky market, no one has done exactly what we are doing now. When you talk about use case trends, I think it will be where the market takes us. We’re extremely young. We have to walk before we can run. We refuse to go to market before we build a product that we’re really proud of”.
- “All of our founding group know firsthand how jaded we are as an industry with dissatisfying software or lack of tools. We’re not going to release something just to put it out there and hope that people buy it. We’re going to make sure that it’s something that we would use ourselves and that we love and that we’re really proud of”.
- “I do see us positioning ourselves deliberately for upcoming battles that are going to take place around who owns data. We want to work with leagues, and we want players to go to their agents, to go to their players unions, to go to their league offices and say, hey, every team should have access to this product because it raises the standard of athlete care and it democratizes athlete data, as it should”.
On their business model and their API integration strategy:
- “We’re a SaaS product based on an annual subscription. One of the joys of being an early-stage company is we haven’t decided if we’re going to scale pricing based on users or based on the scale of the data ingested. We’re going to have internal analytics based on how many queries you are running and how much data you need to process through our cloud systems and all of those types of factors. If there’s a super-user, they might end up paying a little bit more, but most of our revenue will just be based on a flat annual subscription fee”.
- “Then some add-ons to the à la carte will come from additional APIs. We’re proactively offering the most important data ingestion methods that teams are going to want to make it automated. You go and you log into your Catapult account through our platform, and then every single time you push something through, it’s just ending up straight in your database through the Gemini web application”.
- “But then when we run into a baseball team that wants TrackMan, or we build Second Spectrum integrations for basketball and things like that, we’re going to build those on as we go, of course, as we add to our customer base in early days. And what’s cool about that is eventually we’ll just have a snowball with a big menu of offerings. It’ll be like a Cheesecake Factory menu where you just have that whole stack and you don’t know what to choose from”.
On their plans for the next 12 months:
- “The next 12 months, Julien, we’re going to build our product. We’re going to launch it. We’re going to work hard to make our first dozen customers very, very happy. The way we’re going to do that is by measuring twice and cutting once and by moving slowly now so that we can move fast later”.
On the types of teams and leagues that he believes will be a good fit for them:
- “If you look at like the National Hockey League right now, they’re upscaling analytics so fast that we know that they’re excited about this, whereas baseball, baseball is probably the only league with more than one or two teams that probably don’t need us right now. Eventually, we’ll have something built that they’ll need, but right now they’ve got a lot of fire power and the Dodgers, like they’ve got enough, the Red Sox, they’ve got enough, but hockey, I think we could help every hockey team right now so that will help guide our journey, I think”.
- “If you have 20 data scientists, you might not need us right now but we can probably help. If you have zero data scientists, you’re probably not ready for us. If you have between two and 10, that’s probably a great place to be”.
On his take on the NFT space and the question around players:
- “Well, that’s an impossible question in a very fun way, because if I say, oh, we’re all in on NFTs, Web3 and metaverse, then I’ll make a bunch of people think that I’m hip and modern, and other people will roll their eyes. And if I say, I think that’s a trend and I think it’s Pokemon for adults, then some people will say that I’m missing the cutting edge, right? What I’d say is that we haven’t found the functional use for those technologies yet. I think that they have a place in fan engagement and maybe fantasy and betting, but it’s not there for actual athlete performance or athlete facing data operations”.