The US passed the largest climate bill in American history this summer: the Inflation Reduction Act. With new energy sources will come new challenges for our already-struggling grid.
Now more than ever, innovations in AI and machine learning are so important for stabilizing the grid.
Today I’m talking to Sean Murphy, the CEO and Co-Founder of PingThings, Inc., a software platform for high-frequency time series data bringing AI and machine learning to some of the largest utilities in the US.
PingThings is backed by GE Ventures and is also spearheading the Department of Energy’s National Infrastructure for AI on the Grid initiative.
Previously, Sean built Full Stack Data Science, a consulting firm that focused on bringing AI, machine learning, and data science into the enterprise space, specifically in government, finance, consumer, education, and healthcare.
He also spent over a decade as a senior scientist at Johns Hopkins University Applied Physics Laboratory.
With such a wide breadth and depth of experience, we’re excited to speak to Sean today to learn more about his innovative platform.
Sean Murphy: I’m really glad to be here, and it is an incredibly exciting time for the energy transition for the future. The money that the government’s spending and investing is a big deal, and it’s going to move the needle. You’re going to see despite whatever the rest of the economy’s doing, energy’s heating up, and I think that’s very good news for innovation.
[4:10] How did you get into data science? When did you know you would be a data scientist?
Sean Murphy: Oh, that’s an excellent question. So my background, I was actually pre-med, bailed on med school, and I have degrees in math and electrical engineering. And then I did my grad work at Hopkins in biomedical engineering, and I worked with data. I was computational, I had a numerical background in undergrad, and I realized even all of the sciences were really just math and all of that was computers, it was all simulation, it was all numerical analysis. And so at Hopkins, a lot of the work that I did, everything was simulated, everything was algorithm development, everything was processing and understanding data. So I saw the sea change from the East Coast. I recognized that the world was turning to data, and worked on a lot of early machine learning approaches for anomaly detection and time series at this point almost 20 years ago, which I hate to say because now I sound old, but teaching machines and computers to learn from data makes a whole lot of sense. So now it’s slowly rippling out, with industry after industry after industry adopting those ideas and those techniques.
John Belizaire: So it sounds like it was a natural progression. The more you spent time with data and understood how valuable it can be to solve problems or get insights, you became more interested in the space and then kept with it.
Sean Murphy: Absolutely. And the fun part is that there’s a famous quote, “You cannot improve what you cannot measure,” or “You cannot improve what you cannot quantify.” I think that applies obviously to machine learning, et cetera, but it also applies at the individual level. If you are trying to improve as an athlete, as an artist, or as a student, it’s the same idea. We need these quantified metrics by which we benchmark and measure how we’re doing on a day-to-day or week-to-week basis. So yeah, data, it’s the record of our accomplishments to some extent. So it’s critical.
John Belizaire: And more data is becoming increasingly available to us. It’s just exploding in the world.
Sean Murphy: I think that is one of the reasons why we keep seeing positive economic outcomes because there is technological innovation that’s happening and it’s allowing us to do more with the same amount of resources. And just as software was supposed to eat the world, it’s now software being trained on data, the original software. So, it’s inevitable.
[07:36] Let’s talk about your current venture and focus. I think it’s very exciting to use data and software to create a more predictive grid. In fact, you call it PredictiveGrid software. How does it work and what problems can it solve for our current grid system?
Sean Murphy: Yeah, so the PredictiveGrid, and I got to say, my company is filled with really brilliant computer scientists and data scientists. We don’t have anyone in marketing or branding, so I apologize for the names. And if you go to our website, it’s terrible. I built it. So, it’s a dumpster fire. You have my apologies. We offer a cloud-based time series data platform, and it’s really designed to manage, analyze, and apply machine learning to time series data. And when I say time series, just to be clear, to different people, it means different things. I come from a physical sciences and physical engineering background. Time series is always, I have a timestamp and I have a measurement. And that’s a measurement of typically the physical world. So it’s a floating point number and it’s describing the behavior of some sort of asset. And with a lot of sensing, the time series data is regularly spaced. It happens once a second or once every nanosecond.
So, we built a platform that was designed to handle just unheard-of amounts of time series data. It’s a distributed computing system, it’s horizontally scalable. It was built that way from day zero because we looked at the grid and we looked at the distribution system and we said, Hey, if we, or preferably someone else, add higher-frequency sensing to the distribution system, we are looking at tens or hundreds of thousands of high-frequency sensors that are sampling the behavior of the grid at 60 or 120 or 240 samples per second per channel. And we’re going to have a hundred thousand of those sensors. We need to be able to ingest that, to store that, to query that, to analyze it, and to really ultimately create value from that data to be able to better manage the grid.
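To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It is not PingThings code: the `Point` type, the function names, the nanosecond timestamp unit, and the default of one channel per sensor are all assumptions made for illustration. It models a time series the way Sean describes it (a timestamp plus a floating point measurement, regularly spaced) and estimates the aggregate sample rate for a hundred thousand sensors.

```python
from dataclasses import dataclass

NANOS_PER_SECOND = 1_000_000_000


@dataclass(frozen=True)
class Point:
    """One time series sample: a timestamp plus a floating point measurement."""
    timestamp_ns: int  # nanoseconds since an arbitrary epoch (assumed unit)
    value: float


def regular_series(start_ns: int, rate_hz: int, values: list[float]) -> list[Point]:
    """Attach evenly spaced timestamps to measurements sampled at rate_hz."""
    return [
        Point(start_ns + (i * NANOS_PER_SECOND) // rate_hz, v)
        for i, v in enumerate(values)
    ]


def samples_per_second(sensors: int, rate_hz: int, channels_per_sensor: int = 1) -> int:
    """Aggregate ingest rate; channels_per_sensor=1 is an illustrative assumption."""
    return sensors * rate_hz * channels_per_sensor


if __name__ == "__main__":
    for rate_hz in (60, 120, 240):
        total = samples_per_second(100_000, rate_hz)
        print(f"{rate_hz} Hz x 100,000 sensors: {total:,} samples per second")
```

With one channel per sensor, the three rates work out to 6, 12, and 24 million samples per second; real sensors typically report several channels, which multiplies the total accordingly.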
That drove the design requirements for the platform. It took some time. But I’m very proud because basically, we’ve been ingesting and processing transmission systems’ worth of data since 2019. Millions of measurements per second per customer. Our reliability record’s phenomenal. And we keep making the platform faster and faster and faster so that we’re processing tens of millions of measurements per second per node. So our performance, our analytical capabilities, they just keep getting better and better and better. It’s fun because when you look at some of the other folks in this space, their view of time series is a timestamp and a tweet, or a timestamp and an image. And for us, we are hyper-focused on time-stamped quantitative measurements of physical systems. That allows us to really just add an insane level of optimization and get speed and scale which is pretty much unheard of.
So it’s been really fun because when I was at Hopkins, we had data sets where we had a polysomnographic study, which is really, it’s a sleep study. So someone goes in, they get hooked up to an EKG and EOG and EMG and all sorts of physiological signal recording, and they sleep for eight, 10, 12 hours. And so you have 12 hours, and each signal is a thousand hertz, it’s a thousand samples per second, and you have 10,000 of those patients. And back in 2003, I had to figure out how to process all of that data. And I really wish I had my platform back then because the tooling for time series data, it’s not fantastic. It’s one of these things that’s been left behind. So we’ve been trying to fix that.
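The scale of that sleep study data set can be worked out directly from the figures Sean gives. The short Python sketch below does the arithmetic; the function name is hypothetical, and the count is per recorded signal because the number of signals per patient is not stated in the interview.

```python
SECONDS_PER_HOUR = 3600


def samples_per_signal(hours: int, rate_hz: int) -> int:
    """Number of samples in one continuously recorded signal."""
    return hours * SECONDS_PER_HOUR * rate_hz


# Figures from the interview: 12 hours, 1,000 samples per second, 10,000 patients.
per_patient = samples_per_signal(hours=12, rate_hz=1000)
whole_cohort = per_patient * 10_000

if __name__ == "__main__":
    print(f"Per patient, per signal: {per_patient:,} samples")
    print(f"Whole cohort, per signal: {whole_cohort:,} samples")
```

That is 43.2 million samples per signal for one night and 432 billion per signal across the cohort, before multiplying by however many physiological channels were recorded.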
John Belizaire: So if I were to use the analogy that you just went through here, the sleep study and the data points and pieces of data you’re collecting from the individual that’s asleep to try to understand what’s happening with them, what’s the equivalent for the grid and what is the predictive software doing for the grid?
[12:40] What is the predictive software doing for the grid?
Sean Murphy: Yeah, it’s a great question. So the equivalent or the analogy for the grid is each patient would be an asset. It could be a transformer, it could be a STATCOM, it could be a transmission line, it could be a cap bank, it could be a recloser. Basically, any smart asset that’s already deployed on the grid. And we want to understand the behavior of that piece of equipment. We want to understand how the grid is doing its fundamental job. And that is, for a transmission system, it’s transmitting electrical power, and we want to do that reliably. We want to do that resiliently, and we want to do that as efficiently as possible because if I can decrease the waste from 7% to 6% or 5%, that’s a tremendous amount of greenhouse gas savings.
[13:51]: How have utilities responded to your software so far? Have you noticed any change to their approach to renewable energy using the software or any insights you can share with us here?
Sean Murphy: Yeah, it’s actually fascinating. Utilities are regulated monopolies, and the incentive structure for a typical IOU is a very complex beast. There are a lot of regulatory requirements, there’s a very complex set of rules that they have to follow that they’re trying to optimize. So a lot of times their behavior, just from a layman’s perspective, from the outside, looks a little bit like they don’t care or they’re not reacting to incentives as one would expect. And so in some ways, they have lagged in the adoption of technology. And in the case for us, it’s actually crazy fun because if you compare the PredictiveGrid to some of the legacy historians that are typical, that have been deployed for years at some of these utilities that we work with, it’s like going from the Stone Age to the Jetsons. One of our first customers, they were using a historian that’s the 800-pound industry gorilla.
And just basically to sell our platform what we did is we had one of the engineers do a screen recording and they recorded using the legacy historian, they said, Hey, I want to query this data stream and they want to retrieve that data. And they hit record, and then they went to lunch and they came back and 35 minutes later it finished. And with our platform, that query for data returned in about a hundred milliseconds, so faster than really the human eye could register. So basically instantaneous. And when you go from having data that basically you just can’t get access to, so you don’t use it, to having data at your fingertips, it starts to transform everything that you do. Oh, there’s an outage. Let’s see if we can quickly determine where that outage is. Done. Okay, we know where the outage is. So now all of a sudden, well that was easy.
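As a quick check on how large that gap is, the ratio between the two query times quoted above can be computed in a couple of lines of Python. The variable and function names are illustrative only, and the inputs are the approximate figures from the anecdote, not benchmark results.

```python
def speedup(slow_ms: float, fast_ms: float) -> float:
    """How many times faster the second query is than the first."""
    return slow_ms / fast_ms


legacy_query_ms = 35 * 60 * 1000  # about 35 minutes, per the anecdote
platform_query_ms = 100           # about a hundred milliseconds

if __name__ == "__main__":
    ratio = speedup(legacy_query_ms, platform_query_ms)
    print(f"Roughly {ratio:,.0f}x faster")
```

Taking both figures at face value, the difference is a factor of about 21,000.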
Now my response time, which if I’m a utility, my response time’s critical to events. It’s something I’m measured on. And so now my response time’s dropped. This is amazing. Let’s do more now. Let’s start automating the localization of those events so that we can respond even faster. And then as we keep going along that path, it’s, forget about detecting the event, let’s predict that the event’s going to happen so that it never happens in the first place.
John Belizaire: So interesting. I want to shift to talking about another aspect of your company that’s not software, it’s community. And it seems like one of the main pillars of PingThings is building a community where engineers, analysts, and other stakeholders can exchange expertise, especially if you are dealing with these government organizations, these utilities that aren’t used to dealing with data, they haven’t spent 20 years like yourself living and breathing it and getting control over it.
[19:58]: It looks like your group hosts training and hackathons. And you’ve previously, you did this where you built a whole community of 10,000 plus data practitioners. What led you to realize that community was important and knowledge sharing is important in the work that you do at PingThings?
Sean Murphy: Yeah, absolutely. And this is an ideological, a philosophical issue. So, comparing opposite ends of the spectrum, one is you can build these silos or towers where you have these walls and you do your thing and you do it very well. And it’s like you don’t want to play well with others and you want to protect everything you have. And then the other mentality is, I want to build something as a horizontal layer. So for PingThings we want to make working with time series data at any scale easy and simple and fast. And so I don’t care who makes a sensor, I don’t care who made the smart asset, I don’t care whose grid it is. I just want to enable the use and value creation from all of that time series data. And so I’m a horizontal layer, please build on top of me.
We’ll build some stuff, third parties will build other stuff, and customers will build even more stuff. And so I think one of the issues that has existed in the utility space is a lot of utilities, they’re full of silos. They’ll have sensors that monitor the grid, and even though those sensors might be installed next to each other, the data flows into a separate system. It’s siloed. You can never reuse that data. And those groups don’t communicate, and everything’s built as these silos. And I don’t think that’s nearly as effective. You don’t get innovation happening at the same clip. When you have more open data sharing, collaboration, cross-pollination, it just works better. But it is a fundamental cultural shift.
John Belizaire: Yeah, that’s so interesting. And so by creating community, you are able to infuse that cultural shift. An educated customer is the best customer. So if you have communities where customers can begin to educate themselves and also people learn from a platform like this, I can imagine that that could be very powerful for the growth and success of the business as well.
Sean Murphy: Yeah, absolutely. When I’ve thought about this, and I think this permeates literally everything from industries, government, and countries to individual personal life paths, it’s one of two mentalities. You either are fighting over a fixed resource. I will stab you in the back to get a slightly larger piece of the pie, or you believe I’m going to grow the pie. So I don’t care how big your slice is because I’m making the whole pizza bigger.
We believe, and maybe this is naive, but I don’t know, by force of will, we’ve been making it work, that we can grow the pie. And I think the utility industry for a long time has been folks fighting over what they perceive to be a fixed pie.
[24:08]: What’s the role of open-source data? I noticed that that’s a big part of your approach and platform. Why is that a part of it? I thought you were going after the sensors on the resources.
Sean Murphy: Yeah. And actually, I would say being able to share data. A lot of times when you work with utilities, their data is protected, it’s considered confidential. And that creates a lot of different encumbrances. But even the utilities, even though it’s their data, oftentimes utilities, they want to work with this utility or this other utility, or they want to work with the university, or they want to work with this think tank or consultancy. And the ability to share data very quickly and easily with the people that they choose is incredibly needed. Because right now, when we first got started, if you wanted a year of data from a utility, it might take three years to gain access to it, because you had to go through NDAs and the legal department, and then you had to export that data from some ancient data system.
You couldn’t collaborate easily. And then we also think through one of our government projects, we’ve deployed a number of sensors on the grid, and those are our sensors, and we stream them into an open cloud-based version of our platform, and we make that data accessible. So because we own the sensors and we’ve deployed the sensors, we stream that data and we make that data accessible to anyone. So I think there’s a place for open data in this industry. I think that actually even more important is the rapid ability to share the data because, for critical infrastructure, cybersecurity is always going to be an issue. But if you lock data down so much that you can’t work together, you’ve already lost.
[32:54]: You’ve founded four companies. What advice would you give to entrepreneurs?
Sean Murphy: My number one rule of startup club is just don’t die. Do not die. Staying alive. Maybe I should put it more, stay alive. Never run out of cash. Cash is king, do not die. And that rule right there will go a very long way. But I think the other thing that I would tell them, and I would sit them down and I would say, look, this is hard, it’s brutal. There are going to be emotional ups and downs, the likes of which you’ve never really experienced before. That’s going to happen sometimes on a daily basis. And you go back to rule number one, don’t die, stay alive.
And it’s also, the other thing is, both my dad and my mom were relatively poor growing up. My dad was gifted, he’s a pretty smart guy. And he, through some good fortune, got into the middle class. But when you grow up poor, like they did, you have the scarcity mentality and you have a model of the world, which I’ll say isn’t necessarily fantastically useful for an entrepreneur. I went into startup land and I was like, Oh, this is a meritocracy. It doesn’t matter where I’m from, it just matters what I can accomplish.
And it’s true, but cash is king, and access to capital is not a meritocracy. Access to capital is a, it’s a socioeconomic game. There are very strong network effects. You see lots of companies where you’re just scratching your head, How did anyone fund them? I think the second thing I would tell them is another pretty basic rule after the first, don’t die, life’s not fair. And it flows back into rule number one, just don’t die.
[36:05]: I want to talk a little bit about the role of AI and the grid for a second. And I want to put it against the backdrop of, we’ve just got this Inflation Reduction Act that’s clearly, incredibly compelling. You’re talking about approaching $370 billion of investment into clean energy and clean energy incentives. That’s going to drive much more investment in the space. So suddenly the grid is going to see a whole new growth in generation, and new growth in generation, intermittent generation, has all sorts of problems associated with it, as well as advantages. How can AI help there? It’s going to create stability issues with the grid. What can AI do?
Sean Murphy: This is a fantastic question. Let me go off on a slight tangent to then hook back into the question. The original grid is a physics-based system. There are some giant generation facilities like a hydroelectric dam, water rushes through, it spins large hunks of metal, and generators turn. And we have a 60-hertz alternating current power system because there are giant hunks of metal that are spinning around and around and around, and they’re generating electricity and power. And I have equations that tell me exactly how that system is going to work, how those electrons are going to flow, and how I’m going to be able to deliver power. So we’re ripping all of that out.
And instead, what are we doing? We’re taking that one giant generation system, that source, that hydroelectric, or that coal-fired plant, or that nuclear reactor. And we’re replacing each one of those with all of these small little solar farms and rooftop solar and wind farms. And all of those devices don’t generate power typically by spinning hunks of metal. And that power then has to be adapted to the grid. And instead of one source, there are now a thousand different sources and they all have to be coordinated. So the complexity of the system has increased exponentially. We’ve basically gone from analog generation to silicon generation, and we’re then trying to map that back to a legacy grid that wants the old-style generation because those are the requirements under which it was built. So when you have a physics-based system, which is governed by reasonable equations, you can build models, you can solve those simulations, and you can understand how the system behaves.
When the complexity goes up exponentially, each individual power-generating device is actually controlled by a computer. It’s a black box and it behaves in a way that is not determined by any physics we understand. The only way you can monitor and control that grid is through data because the physics doesn’t work anymore. You can’t have an equation that tells you exactly how that uber-complicated system is going to behave. So I would posit that without data, we’re not going to be able to run this new grid. We have a legacy grid sandwich, and on one side is silicon-based generation. And on the other side is this complex new silicon consumption like electrified transport and electrified industry. And the complexity is skyrocketing. And the only way you can deal with that is with data. And using ML, training AI algorithms to respond to that data is really the only way you can work with data at that scale and at the needed velocity. So I would say that AI and data are going to make the grid we want to build possible.
John Belizaire: Got it. That’s a great way to show the difference, the dichotomy between the legacy and the new, I like this physics versus silicon. It’s super cool. Thank you.
Sean Murphy: Yeah, absolutely. And it’s important because renewables, they’re great, we’re getting energy from the sun and there’s no carbon. This is fantastic. But nothing is just all pluses. There are pluses and minuses. There are downsides. Well, how does it perturb the existing infrastructure? Can we manage and control this and reliably supply a society, a civilization that’s built on 24/7 access to electric power?
John Belizaire: Yeah, exactly. And going through this incredible transition where a good part of our economy will become electrified, and the role that plays on something that we just took for granted, the grid, it’s becoming a more vital part of our lives to some extent.
Sean Murphy: Absolutely. From an engineering perspective, these are all trade-offs. We are very focused on carbon footprint. We are very focused on clean energy and those things are great, but… there are going to be growing pains in that process that I think unfortunately really haven’t gotten enough press.