EP 185 - Bringing AI to the edge with analog computing - Dave Fick, co-founder & CEO, Mythic

Wednesday, August 09, 2023

In this episode, we talked with Dave Fick, co-founder and CEO of Mythic. Mythic has developed analog computing technology to deliver high-performance AI processors that are ten times more power-efficient and cost-effective than digital solutions.

In this talk, we discussed why analog computing is uniquely well-suited for machine learning on the edge, due to low energy consumption, low latency, and high capability. We also talked about several industrial applications that are using analog computing today.

Key Questions:

● What is analog computing?

● What is the difference in form and cost structure for analog and digital solutions?

● How does analog computing deal with 'messy data'?

Transcript.

Erik: Dave, thanks so much for joining us on the podcast.

Dave: Thank you for having me.

Erik: Dave, I'm really looking forward to this one. This is a topic that is quite new to me. And I always feel like when I'm hosting this podcast, it's a little bit of my advanced studies. It's kind of my continuous education in technology. This is a topic I'm really looking forward to getting into. But before we do that, I'm interested actually in understanding how you came to set up the company. It's very ambitious, right? I mean, it's quite an interesting technology domain. You've basically set this up immediately after graduating. So if I see your CV, it's kind of like I did some internships, got my PhD, and then set up a company and raised $165 million. So it's successful. At least, successful in the financing. I know success in business can be different from success in venture. But I would love to understand what led you at that point in your life to feel like you could take on this very ambitious challenge.

Dave: Yeah, absolutely. We came out of the PhD program. Actually, I came out of the University of Michigan in Ann Arbor. Michigan has one of the best chip PhD programs. They worked with Intel, I believe in the '90s or early 2000s, to come up with what would be a really practical program. And so Michigan did a great job of preparing students for not just what is a piece of the chip design process, but also how you do the whole thing end to end. And so doing the program there, I came out with experience building, I think, almost a dozen chips by the time I got done, counting the ones that I had led and the ones that my colleagues had done and I assisted on.

My co-founder, Mike Henry, came out of Virginia Tech. He and I had this idea of applying for SBIR projects, which is a government research program, and starting a company to develop new technology. Coming out of the PhD program, you're used to working with faculty. They're giving you the projects to work on. This was our first chance at really pursuing technology that was directly in our interest, applying for the grants ourselves and pursuing it. And so we were able to win two contracts: one for analog compute for GPS signal acquisition and one for neural networks. We were able to subcontract that to the university so that we were able to get a little bit of office space, some time from some master's students, and work in the university environment as an incubator.

Both of those projects did very well. They both showed about 100x improvement in performance and energy efficiency. We found that the neural network space was blowing up. We started in 2012. And so 2012 is when this foundational paper, AlexNet, had come out of the University of Toronto, I believe. We were accelerating that algorithm. In that time, this neural network space really blew up. It went from something that was a curiosity in the 2000s to a mainstream research space in the 2010s. And so since then, it's been quite the experience building a technology and a company around it.

Erik: Okay. So good timing. Also, it's a luxury but it's also really a success factor of being able to do something that you're personally interested in, right? When you talk to founders that have a passion, they tend to stick with the problem and be much more successful.

Dave: Yeah, anytime you're building a company or doing a new technology from the ground up, you're always going to run into challenges. And so the more you're willing to push through, the more late nights that you're willing to spend thinking about these hard problems and solving them, the more likely it is that you're going to succeed. Definitely working on interesting, exciting technology has been helpful to us, but then also for recruiting and finding investors. People are excited about the technology that we're developing. That makes a big impact in our success, beyond just our own enthusiasm but all the people that we surround ourselves with.

Erik: Well, let's get into what you're actually working on. Give us the 101 first. What is analog computing, and how does it differ from the more traditional computing approaches?

Dave: Analog computing. When we think about digital, digital is about ones and zeros. So if you want to represent a number like 125, you actually need at least 7 bits to represent that. When we talk about bits, those are effectively digits that can be represented with only one or zero. So binary. If you want to go up to larger numbers like millions or billions, then you'll need more bits: 20 bits, 30 bits. Physically, what that means is, for each one of those bits, you have to have a separate wire. Those separate wires go to separate transistors, which are devices that can move the wires up or down. Anytime you're moving a wire up or down, you're going to expend energy. And so when we think about, hey, my processor draws 100 watts or something like that, what the 100 watts is doing is it's moving a bunch of wires up and down. Every time you move a wire up or down, that uses up a little bit of energy.

When we do analog computing, what that does is, instead of having the wires representing one or zero, a wire could represent maybe 27 bits of information. 27 bits would be about 128 million different values. Instead of having just two values, you could have 128 million values on that wire. By compressing all the information into a small number of wires and a small number of transistors, you greatly reduce the number of wires that are moving. So now you've dramatically reduced the energy consumption. But you also reduce the distance that information needs to travel. The speed of signaling on a wire is, I think, less than half the speed of light. And so that speed of light seems really fast. But when you're operating at gigahertz speeds, it's actually measured in centimeters as opposed to microns. Normally, when we're moving signals across a chip, you're thinking about, how many microns am I moving in that? How much time is that going to take? Even at light speed, that's what slows you down. That's why you can only operate at, say, five gigahertz instead of five terahertz.
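The bit counts and distances mentioned in this answer can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the signal speed is taken as exactly half the speed of light, which is an assumption based on the "less than half" estimate above.

```python
# Illustrative arithmetic only; function names are made up for this sketch.
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def bits_needed(value: int) -> int:
    """Smallest number of binary digits that can hold a non-negative integer."""
    return max(1, value.bit_length())

def distinct_values(bits: int) -> int:
    """How many different values a given number of bits can encode."""
    return 2 ** bits

def distance_per_cycle_m(clock_hz: float, fraction_of_c: float = 0.5) -> float:
    """Distance a signal covers in one clock period at a fraction of light speed."""
    return fraction_of_c * SPEED_OF_LIGHT_M_PER_S / clock_hz

print(bits_needed(125))              # 7
print(bits_needed(1_000_000))        # 20
print(bits_needed(1_000_000_000))    # 30
print(distinct_values(27))           # 134217728, which is 128 * 2**20
print(distance_per_cycle_m(5e9))     # about 0.03 m, roughly 3 cm at 5 GHz
print(distance_per_cycle_m(5e12))    # about 3e-05 m, roughly 30 microns at 5 THz
```

Read this way, "128 million" is 128 binary millions (2^27 is about 134 million in decimal), and a signal at half light speed covers about 3 cm per cycle at 5 GHz but only about 30 microns per cycle at 5 THz.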

The information density that we're able to achieve with analog allows us to save energy. It allows us to move faster. That allows us to build basically smarter systems, more powerful systems, more efficient systems. Now, the drawback is, when you're using analog, if you have 128 million different values on a wire, it can be difficult to determine what precise value is on that wire at any one time. There's always noise in an analog signal. So if you turn up your speakers really loud in a stereo, you'll start to hear a little bit of crackling. There's always what's called a 'noise floor.' No matter what system you build, there's going to be a tiny bit of noise in the end.

And so you can't use analog computing to, say, run Excel or anything where you'd want to have an extremely precise value at the end. Because as soon as it's analog, there's going to be some amount of stochasticity to it. But in neural networks and other signal processing applications, these require not precision but very good estimations. I think we'll get into it a little bit more. All of the most difficult computing problems — I shouldn't say all, but many of the most difficult computing problems of today are taking some sort of signal like a camera or a microphone, and trying to figure out what was seen or what was said. Or even if you think about somebody takes some action in front of you, you're trying to interpret what they're doing. That might be not just vision. It might be trying to figure out their body language, their intent. These problems don't have the same level of specific answer like an Excel spreadsheet would. It's a predictive problem. And so this is a domain where we can really leverage the analog computing.

Erik: Okay. That makes sense. So if you're doing a financial analysis, there is a correct and an incorrect answer. You want to use digital computing so that you can have a structure that provides only the correct answer. If you're using analog or if you're dealing with any kind of machine learning, you're dealing with a probabilistic answer. Therefore, analog is fine. Because anyways, the data allows only a probabilistic answer, and analog can get you to that answer quicker, cheaper, with less energy, and then I guess a sufficiently high success rate. Is the success rate for analog, does that end up also suffering to some extent versus digital for machine learning, or are you able to match or exceed the accuracy also of projections?

Dave: We can definitely match or exceed. There's a couple different aspects of this. One is, how powerful is the neural network in the first place? Smaller ones tend to be a little more fragile. They may not have enough extra capacity to take on the analog nature. But most modern neural networks, or almost every new modern neural network, will have sufficient capacity to work with analog computing.

In practice, what we see is, if you're using a computing technology that allows orders of magnitude improvement in energy efficiency and performance, then you can apply a much bigger neural network to the problem. And so in practice today, our customers are typically not able to run the latest and greatest neural networks that are out there in research. You'll see research papers with enormous networks that require a multi-thousand-dollar GPU to run. It won't be able to run real time. It's certainly not on the hardware budgets that device makers are working with today. You can't put a $1,000 GPU into a $250 consumer device, for instance. And so the question becomes, if I'm going to buy $10 worth of silicon and put it into this device, what is the biggest neural network I can put in there?

There has been a whole class of research around that problem. There are networks like MobileNet and EfficientNet that are specifically targeted at these embedded systems. How do we fit a neural network that's at least moderately capable into that budget? There's a budget from a cost perspective, and then there's also the budget of just cooling a system. You can't draw 500 watts inside of the space of a small camera, for instance. And so what we allow them to do is upgrade from these really small neural networks that were designed for small embedded systems to more advanced neural networks. That improvement in just the model that you're using provides a big increase in accuracy.

Now, what we've also seen is, for digital systems, many of the competing products out in the market, whether it's a specific accelerator or even like an Nvidia GPU, there will be a lot of focus on model compression. That's what it's called. So taking a neural network, it might have, say, 10 million weights in it. So neuron parameters. They try to trim away, say, 90% of those or 99% of those in order to make the model smaller and more efficient to execute. Many times, what we see is, the amount of pruning that device makers are forced to do to try to fit these models inside of an embedded system ends up causing a huge loss in accuracy, far beyond the little bit that would occur from switching to analog signals.
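As a rough illustration of what "trimming away" weights can look like, here is a minimal sketch of magnitude pruning, one common compression technique. It is an assumption that this is the kind of pruning meant here; the function name and the example weights are invented for the illustration.

```python
def magnitude_prune(weights, keep_fraction):
    """Zero out all but the largest-magnitude weights (hypothetical helper)."""
    keep = max(1, round(len(weights) * keep_fraction))
    # Rank weight indices from largest to smallest absolute value
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]

# Made-up weights for a single tiny layer
weights = [0.02, -0.9, 0.15, 0.4, -0.03, 0.01, -0.6, 0.05, 0.3, -0.07]
pruned = magnitude_prune(weights, keep_fraction=0.3)
print(pruned)  # [0.0, -0.9, 0.0, 0.4, 0.0, 0.0, -0.6, 0.0, 0.0, 0.0]
```

Keeping 30% here already discards weights such as 0.3 and 0.15; keeping only 10% or 1%, as in the figures quoted above, discards far more, which is where the accuracy loss comes from.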

Erik: Okay. Interesting. So if it's let's say Google or somebody that wants to build a large LLM, they'll do this in a data center using maybe Nvidia A100 chips. Because they can put the data center somewhere where energy is cheap and space is abundant, and it's highly centralized. But if they then want to deploy an algorithm on anything that's embedded in a normal device (I guess whether it's a computer or an IoT device of some kind, a car or anything like this), then that approach is not going to work, or they're going to have to trim down significantly to the point where the results start to suffer.

If we then look at the device specifications in terms of — energy, you've already identified. If we look at the form factor, the size, if we look at the cost, how do those tend to compare? I guess you're comparing a bunch of variables so it's not kind of an apples-to-apples comparison. What tends to be the difference in terms of form factor and the difference in terms of cost structure for an analog versus digital solution for a, let's say, sophisticated IoT device?

Dave: Sure. Let me just add something about the difference in costs in terms of the application. I think I was reading an industry analysis report recently that said that ChatGPT-4, which is one of the models that's getting a lot of attention right now among the LLMs, required two large servers to run. Each cost more than a quarter million dollars. From a form factor perspective, it's probably 16U total, and half a million dollars. The latest and greatest research that you can find and the things that grab the headlines are these enormously expensive systems. But when we think about trying to take that research and impact industry, many of these systems that we're working with are like a camera system or a small robot. It might be able to draw somewhere between 5 and 20 watts for the total power budget. The cost of the hardware, like the computing hardware, probably needs to be less than a couple hundred dollars. I mean, this is pretty uniform around the industrial space, for instance. I think we get a pretty consistent picture there. On the consumer side, it's even more restrictive because consumers, except for Apple devices, aren't typically paying thousands of dollars for one system.

In terms of what implementing an analog compute accelerator versus a digital accelerator would look like, from the outside, it's actually very similar or the same. Our chip works as a PCI Express attached accelerator, meaning kind of like how you'd have an Nvidia GPU attached to a PC. In our systems, you'll have an SoC, like an NXP platform, for instance, or a TI platform. It'll have a PCI Express port which will talk to our chip. Inside of our chip, there are analog computing engines, but the top level of the architecture is actually digital. And so it's not that the entire chip is analog. It's that there are certain key operations that map really well to the analog domain. Those are these matrix multiply operations. When you think about neurons, your brain has billions of neurons. In a neural network, they're in what are called layers. You might have only 64 neurons in one layer and a couple thousand neurons in another layer.

In our analog compute engine, we can store 1,000 neurons. Each can take up to 1,000 signals from other neurons. It can calculate in one shot a quarter million matrix multiply operations. Actually, I should say multiply-accumulate operations for a matrix multiply in one shot. That matrix multiply operation, that's done in the analog domain. But the signaling between these analog compute engines, the signaling between the neurons, is digital. And so the way that we do that is, on the input to the matrix multiply, we convert from the digital domain to the analog domain. Then we do this really powerful complex operation in the analog domain. Then we take the result out by converting it back to digital. What that allows our system to do is, we get the benefits of analog computing (the efficiency, the performance) for doing those big matrix multiplies for the neurons. But we also get the benefits of digital communication, storage, and programmability. So we can map any neural network to the system. Because, ultimately, there's software that's managing each of these analog compute engines. And so from the point of view of the software, we're a grid of compute that just happens to have very high-performance matrix multiply operations built into it.
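The digital-in, analog-compute, digital-out flow described here can be imitated in software. The following is a toy model only, not Mythic's design: the 256-level converters, the Gaussian noise, the value ranges, and every name in the code are assumptions chosen for illustration.

```python
import random

def quantize(x, levels, lo, hi):
    """Snap x to the nearest of `levels` evenly spaced values in [lo, hi]."""
    x = min(max(x, lo), hi)
    step = (hi - lo) / (levels - 1)
    return lo + round((x - lo) / step) * step

def analog_matvec(weights, inputs, noise_std, rng, dac_levels=256, adc_levels=256):
    """Toy analog tile: convert in, multiply-accumulate with noise, convert out."""
    n_in = len(inputs)
    # "DAC": digital inputs in [-1, 1] become a limited set of analog levels
    analog_in = [quantize(x, dac_levels, -1.0, 1.0) for x in inputs]
    outputs = []
    for row in weights:  # one row of weights per neuron
        acc = sum(w * x for w, x in zip(row, analog_in))  # multiply-accumulate
        acc += rng.gauss(0.0, noise_std)                  # assumed Gaussian analog noise
        # "ADC": with |w| <= 1 and |x| <= 1 the sum lies in [-n_in, n_in]
        outputs.append(quantize(acc, adc_levels, -float(n_in), float(n_in)))
    return outputs

rng = random.Random(0)
weights = [[0.5, -0.25, 0.75, 1.0], [-1.0, 0.5, 0.0, 0.25]]  # 2 neurons, 4 inputs each
inputs = [0.2, -0.4, 0.6, 0.8]
exact = [sum(w * x for w, x in zip(row, inputs)) for row in weights]  # about 1.45 and -0.2

ideal = analog_matvec(weights, inputs, noise_std=0.0, rng=rng)
noisy = analog_matvec(weights, inputs, noise_std=0.05, rng=rng)
print(exact, ideal, noisy)
```

Even with the noise set to zero, the result differs slightly from the exact dot product because of the two conversion steps; the noisy run adds a further small error on top, which is the kind of imprecision a neural network has to tolerate.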

Erik: Okay. Got you. Okay. Great. The usability is similar. The form factor and the cost structure is similar, but then the capability is much higher for certain types of problems. Let's get into the types of problems that are a good fit here. So your team shared via email a few cases of drones, industrial automation, video security, smart home, AR, VR. So here we're talking about messy data. I guess that's kind of the theme here. So talking about video, images, audio, and then making predictions based on those. What for you defines a great use case? Maybe you can illustrate it with a couple of examples, even if you have to keep customer names proprietary.

Dave: Yeah, we're actually like a general-purpose signal processing system. So the other research project that we'd worked on in the early days was GPS signal acquisition. That used a very similar technology. We could actually implement that on our analog compute today. In that case, you're receiving the signals from a satellite that are below the noise floor. Meaning that if you're looking at the signal coming in from the satellite, it just looks like noise. You can't actually see any signal at all. That's recovered through these massive million-term matrix multiply operations. You have to do a search to find that satellite signal by comparing different time stamps with a pattern.
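The search described here, sliding a known pattern across the received samples and looking for the best match, can be sketched as a brute-force correlation. This is a simplified stand-in rather than a real GPS receiver: the random plus/minus-one code, the amplitude, the noise level, and all names are assumptions made for the example.

```python
import random

def correlate_at(received, code, offset):
    """Sum of products between the code and the received samples at one offset."""
    n = len(code)
    return sum(received[(offset + i) % n] * code[i] for i in range(n))

def acquire(received, code):
    """Try every offset and return the one with the largest correlation."""
    n = len(code)
    scores = [correlate_at(received, code, k) for k in range(n)]
    best = max(range(n), key=lambda k: scores[k])
    return best, scores[best]

rng = random.Random(0)
n = 1023                                             # pattern length (assumed)
code = [rng.choice((-1.0, 1.0)) for _ in range(n)]   # stand-in for a satellite code
true_offset = 317                                    # delay the search should recover
amplitude = 0.5                                      # signal is weaker than the noise

# Received samples: a delayed, attenuated copy of the code buried in unit-variance noise
received = [amplitude * code[(i - true_offset) % n] + rng.gauss(0.0, 1.0)
            for i in range(n)]

estimated_offset, peak = acquire(received, code)
print(estimated_offset, true_offset)
```

Each sample on its own is dominated by noise, but summing 1,023 products lets the matching offset stand out. Trying all 1,023 offsets takes about a million multiply-accumulates, which is the kind of large matrix-style workload mentioned above.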

Any type of signal processing application can be a good fit. Some of the ones that we've talked about today, for neural networks I should say, are object detection or image classification, where you're looking at a picture and you're trying to find, say, people or objects, or you're trying to find defects in some sort of product. We can also do signal processing where we're maybe upscaling an image. There's interest in the television space, for instance, where you start with maybe a very low quality. Think about your cable box. It has a lower-resolution signal, highly compressed because the channel it's sent over has limited bandwidth. You want to upscale that to your 8K TV. And so you can use neural networks to upscale your image in real time from that highly compressed, low-quality, low-resolution signal up to 8K, remove the compression artifacts, provide the higher resolution and so on. There are opportunities in, say, radar and sensor fusion. Again, for something like a Lidar system, you might be looking for objects in 3D point clouds or trying to figure out where walls or lanes are for self-driving cars in the automotive space. I think today we're very focused on computer vision as a starting point. We have our roots in signal processing and see a lot of opportunities in that space as well.

In terms of what we're working on currently with customers, a lot of those customers are looking at more classical computer vision. So if you're in a factory, you'll have some product that you're sorting into maybe grades. If it's produce, you may be trying to sort. Is this a really high-quality apple or a low-quality apple? For instance, this one is something that somebody would pick up at a store and want to buy, versus this one might be better for applesauce. These machines need to move extremely quickly. You can easily calculate, in terms of the processing speed, if I can process this in five milliseconds instead of 10 milliseconds, how much money does that save? If I can improve the accuracy from 90% to 95%, how much money does that save? And so the industrial space has a lot of opportunities for applying these technologies earlier. Consumers might not see them. They're more affecting the supply chain than affecting the things that they're interacting with on a day-to-day basis.

I think in terms of consumer applications, we've seen the smart doorbells and security cameras. At least the ones that I've worked with at home, the ones I've had at my home, there is some attempt to put computer vision into those systems. But at this point, the processing capability that they've been able to implement, given that cost and form factor, has not been high enough to allow the really high accuracy neural networks. That's something that we're hoping that we can change so that when you've got a face detection at home — I think my security camera recently told me that my brother is in my house. He's in Minnesota right now, and I'm in Texas. I was like, okay. Obviously, he's not in my house. And so can we make it so that this technology is actually reliable? That's going to require an order of magnitude improvement in computing performance. I think that's something that these analog computing technologies can help rectify.

Erik: It sounds like, just based on the examples that you've given, with the industrial machine vision there, you might think of something like $10,000 per device, maybe with a subscription around that. The manufacturer is happy to cover that cost because there's a strong business case behind it. If you talk about a consumer application, you're talking about maybe a few hundred dollars per device. Or drones for that matter. I guess you have industrial drones where you could be talking about tens of thousands or hundreds of thousands of dollars. But then consumer, you're, again, in that hundreds-of-dollars range.

What is the timeframe that you think is reasonable for getting from an industrial edge computing solution today down to a consumer device? I guess a lot of factories would rather just take an off-the-shelf camera and be able to work with this instead of paying for a much more sophisticated solution. So if you could also get to the point where you can embed it in a more or less off-the-shelf device with a very affordable micro compute, I'm sure you'd have a market in industrial as well. But what's the timeframe for getting down to that cost level?

Dave: We're working with industrial customers today. I think our second gen is going to have a big leap in terms of cost and energy efficiency. And so the second gen that we're developing will be available in 2025, 2026. It takes another year or two to actually get to the market because that needs to be built into those consumer devices. And so I'd expect to see that in the 2026, 2027 timeframe in the consumer space. But for sure, the industrial space tends to be, from what I've seen, more focused on saving money. Usually, it's about saving money somehow. Then the consumer space is more focused on convenience, or novelty, or entertainment. That tends to be a little squishier. It's what people will pay for as opposed to a straight formula. And so I think getting the cost down is very important there to help move the needle. I expect that will happen in the next couple of years here.

Erik: I guess one other important factor there is scale, right? The chips that are in our iPhones would cost a lot more if Apple wasn't selling 100 million of them a year. If they were selling 10,000, the price point wouldn't be where it is. If you look at analog, is it using the same basic manufacturing capabilities as the more mass-market chips, or do you have to also innovate and figure out how to scale the manufacturing technology?

Dave: No, we use standard manufacturing technologies. The only unique part is we use embedded flash, which is a little less common. But it's still not a technology that we've created. It's something that's available at almost every foundry in a subset of processes. For sure, if we were selling as many chips as Apple, we'd be able to negotiate better pricing with our suppliers. We'll get there eventually. But we're using standard technologies today. That's one of the key advantages for us.

The other thing is that, because we have these advancements in energy density and performance, we're actually able to use older process technology. Our first gen is in a 40-nanometer process, which was modern in, I want to say, like 2006. That was a while back now at this point. But because we have these huge increases in energy efficiency and performance through the incredible information density, that allows us to use these more cost-efficient processes, like 40-nanometer. Our second gen is going to be in 28-nanometer. We kind of have a reset on Moore's law, which has been important. Because the newer process technologies today (I think I saw Apple is using like 3-nanometer) are insanely expensive. For a startup or even a large company to try to do a new technology on, it would be very challenging just from an implementation cost point of view. And so we're very fortunate to be able to be very competitive, given the process technology that we're in today.

Erik: I noticed one topic that was not covered in your market which I thought was interesting, which is the automotive industry. Because if you talk about machine vision, okay, maybe this is a bit more of the future of the automotive industry as opposed to where it is today. But that seems like an important topic. Then the low latency value proposition seems quite strong for this case since there's life and death issues regarding information processing there. How do you view automotive? Do you see that as an industry that has high potential, or do you see that as a bit still more like 5- to 10-year timeframe before they're really investing heavily in ML on the edge for vehicle operations?

Dave: It definitely feels like that space is increasing. The number of cameras on every vehicle is doubling from generation to generation. And so we see that as a space we want to go after. We have some compelling advantages: as you mentioned, the latency advantage, which we haven't talked about yet. Just the cost advantage, being able to put AI computing in every single camera and streaming back high-level information as opposed to trying to stream back super high-resolution video from every camera back to a central processing unit, that reduces the complexity of the system quite a bit, being able to spread out the compute across the cameras. And so we see a big advantage there. We haven't gone after that space yet just as a startup. Anytime you go after automotive or medical, anything where there's safety critical aspects, it greatly increases your costs and your time to market. And so we're starting on the more standard industrial spaces that can move quicker. Then once we have a firm footing there, then we'll move into the automotive space as well.

Erik: Yeah, got it. Well, let's touch a bit more on latency. Because often, when we talk about latency, we're talking about transceivers and the data moving from, let's say, the sensor to a cloud compute location or some edge compute location. But in this case, is it more the latency to perform the operation? Is that what we're talking about when we're talking about improvements here?

Dave: Yeah, when we think about latency in machine learning, there's a question of, if I send an image to my accelerator, how long does that take for it to come back? That's in opposition to... not opposition, but that's different from throughput, which is, how many images, say, can I process per second? The reason why it's different is because you can process multiple images in parallel. So you could process either in batches where you might be processing groups of 16 at a time, or you could be processing them in a pipeline where maybe you're processing four at a time. They're sequential, but it takes four time slots before they get through the system.

And so when we look at digital systems processing neural networks, it's a big challenge, because neural networks and other machine learning technologies have a huge number of weights that they need to work with. It might be, say, 10 million or 15 million weights to a neural network. Digital systems, especially in the embedded space, typically can't store all of those weights on chip at the same time. Because the chip will be too small to have that much SRAM. You just don't have like tons of megabytes of SRAM typically on a small microchip. And so what they'll do is, they'll load the first part of the neural network, process that, then load the second part, the second set of weights. Process that part of the neural network. Load the third set of weights, process the last third of the neural network, and then repeat for every frame. But there are delays every time you're loading a different section of the neural network: there's a delay getting the weights from DRAM or other storage, but typically DRAM. In order to amortize that cost, systems will often do this batch processing where they'll process, say, 16 images at a time. And so now you have a delay for loading the next section of the neural network, but that delay has been amortized across 16 inputs instead of one.
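The trade-off between latency and throughput in this load-then-process scheme can be worked through with a small model. All timings below are hypothetical numbers picked for the example (three sections, 4 ms to load each section's weights, 1 ms to process one frame through one section); none of them come from the conversation.

```python
def batch_latency_ms(sections, load_ms, compute_ms, batch):
    """Time for a whole batch to pass through every section of the network."""
    # Each section costs one weight load plus one compute step per frame in the batch
    return sections * (load_ms + compute_ms * batch)

def throughput_fps(sections, load_ms, compute_ms, batch):
    """Frames completed per second when batches run back to back."""
    return batch * 1000.0 / batch_latency_ms(sections, load_ms, compute_ms, batch)

# Hypothetical numbers: 3 sections, 4 ms weight load, 1 ms compute per frame per section
for batch in (1, 16):
    print(batch,
          batch_latency_ms(3, 4.0, 1.0, batch),   # 15.0 ms, then 60.0 ms
          throughput_fps(3, 4.0, 1.0, batch))     # about 66.7 fps, then about 266.7 fps

# If all weights already sit on chip, the load time drops to zero and no batching is needed
print(batch_latency_ms(3, 0.0, 1.0, 1))           # 3.0 ms
```

With these made-up numbers, batching 16 frames roughly quadruples throughput but also quadruples the time before any result comes back, and that is before counting the time spent waiting for 16 frames to arrive from the camera.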

In neural networks, this is a big impact. Because running a section at a time, that's being done for every single frame. So if you're running, say, 30 frames per second to match the video speed, that means 30 times a second we're loading the first section, the second section, the third section, the first section, the second section. So those delays add up. What analog computing does for us is, we actually have — I mentioned that embedded flash technology earlier in the segment. We're actually using this.

Flash transistors are these storage devices. You actually have them in your phone now, or in your SSD. So the SSD in your laptop or the SD card in your phone, these devices are built using flash transistors. A flash transistor is a special type of transistor that has a floating capacitor inside of it that can store charge. And so you'll store thousands of electrons inside of each of these transistors. Instead of storing just two levels, like one or zero, you can actually store multiple levels. Because instead of storing, say, 2,000 electrons, you store 250 electrons. That means some number. This is called multi-level cell technology. High-density SSDs actually have this today. You can buy a hard drive on Amazon that uses multiple levels per cell.

What Mythic does is, instead of storing, say, 16 levels on one of these flash transistors, we'll store 128 levels on it. That allows us to have not just 10 megabytes of weights on the chip. We actually have 80 million weights on chip. And so we're able to store the entire neural network on the chip at the same time, because we're able to store hundreds of levels on each flash transistor. By doing that, we don't need to load sections of the neural network. We can run the entire neural network in one shot. We don't need to batch the processing. So we can run a single frame at a time. And so our latency going through the neural network is much lower than what you'll see in a digital system. So any application like AR, VR, or systems where you've got moving parts, like the sorting machines I mentioned earlier, or navigation, where latency ends up being very critical, this technology can make a huge difference.
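The relationship between levels per cell and bits per cell is a base-2 logarithm, which the short sketch below works through. The function names are made up, and the last calculation assumes one weight per flash cell, which is not stated in the conversation.

```python
import math

def bits_per_cell(levels):
    """Information held by one cell that can sit at `levels` distinct levels."""
    return math.log2(levels)

def equivalent_megabytes(num_weights, levels):
    """Digital storage needed for the same weights, assuming one weight per cell."""
    return num_weights * bits_per_cell(levels) / 8 / 1e6

for levels in (2, 16, 128):
    print(levels, bits_per_cell(levels))       # 1.0, 4.0 and 7.0 bits

print(equivalent_megabytes(80_000_000, 128))   # 70.0 (MB), under the assumption above
```

Going from 16 to 128 levels moves each cell from 4 bits to 7 bits. Under the one-weight-per-cell assumption, 80 million weights at 128 levels correspond to about 70 MB of conventional digital storage.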

Erik: Okay. Great. Thanks for walking through that. Very interesting. It's a completely different way of storing data and then processing that allows this capability. Two more questions that I wanted to touch on, Dave — the first is how you work with customers. From a customer perspective, what does engineering with analog look like versus what they might be doing today? If somebody is building an IoT device and they say, "Hey, I want to explore this. I want to build up a prototype," what does it take to build a prototype? Then if they say, "I want to go move this to production," what does that take? How do you work with them? Do you basically just ship the chips and say, "Call us if you have questions," or does it tend to be more of a deeper engagement through the development process?

Dave: Today it's definitely more hands-on engagement, just because we're in the early days of the company. Anytime you're developing new software, it tends to be a little rough around the edges. And so we work very closely with the customers that we have today to address the problems that we're running into. That's not unusual for the space.

The long-term vision is that it will be the same as, say, doing what's called quantization-aware training and compilation for digital systems. Anytime you're deploying a neural network onto a system, you need to analyze how efficiently that neural network is being executed on the system. Every system has limits in memory and compute. And so depending on how you develop the architecture of that network, it's going to impact the performance that you see in silicon. And so researchers will do what's called a network architecture search, where they have a tool that will search for the most efficient neural network architecture given the hardware platform that you're targeting. We will plug into those tools just like any of the digital systems.

Then for the analog computing, today we use what's called analog-aware training. So you can take your neural network, and you train it with a model of the analog effects. That allows the neural network to adapt to those analog effects. Just like for you currently, when you're looking around your room, you don't see all the noise that your eyes have inherently in them because your brain filters it out. But if you're really groggy in the morning or something, sometimes your vision will get a little static noise in it. Because you're tired, that filtering isn't as effective. Analog-aware training for neural networks is similar. The neural network can learn to ignore any analog stochasticity. And so we provide the tools to train. Then we are going to make it automatic, just like it is for quantization-aware training.
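The idea of training with a model of the analog effects can be sketched, very roughly, as noise-injection training: the weights are perturbed on every forward pass so that the learned parameters have to work despite the perturbation. The sketch below is not Mythic's tooling. It assumes the analog effects can be modeled as Gaussian noise on the weights, uses a tiny two-weight linear model, and every name and constant in it is hypothetical.

```python
import random


def train_with_weight_noise(steps: int = 2000, lr: float = 0.05,
                            noise_std: float = 0.05, seed: int = 0):
    """Fit y = w0*x0 + w1*x1 by SGD while perturbing the weights on each
    forward pass (assumed Gaussian model of analog variation)."""
    rng = random.Random(seed)
    true_w = (0.5, -0.25)   # target the model should recover
    w = [0.0, 0.0]          # clean weights, as they would be programmed

    for _ in range(steps):
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        y = true_w[0] * x[0] + true_w[1] * x[1]

        # Forward pass sees noisy weights, as the analog hardware would.
        noisy_w = [wi + rng.gauss(0.0, noise_std) for wi in w]
        pred = noisy_w[0] * x[0] + noisy_w[1] * x[1]

        # Gradient of 0.5 * (pred - y)^2 with respect to the weights,
        # applied to the clean weights.
        err = pred - y
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]

    return w


if __name__ == "__main__":
    print("learned weights:", train_with_weight_noise())
```

In this linear toy the noise does not move the optimum, so the learned weights should land close to the target values; in a real network the same mechanism pushes training toward solutions that tolerate the perturbation.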

Erik: Interesting. Okay. That's a different topic, but I've been listening to a few podcasts lately on how the brain processes information. It's really fascinating and quite counterintuitive but different.

Dave: Yeah, definitely when I read about the brain and the signaling that it has, it's pretty incredible compute technologies that we run on as people. I believe the accuracy for human neurons is similar to the ones that we use at Mythic. We operate with 8-bit precision. If I remember right, the neurons in the human brain are around 7 bits. And so we're definitely able to do very capable and powerful algorithms with this type of technology. It's a matter of building the right software and systems around it.

Erik: Yeah, that's right. The more you learn about how humans process information, the more you feel like, yeah, we could probably build a silicon computer that basically operates as capably as the brain. At least, in some areas, right? I mean, there's nothing magical about the brain. It's just a very sophisticated processing machine.

Dave: We're in the early days in all of these. Understanding the brain is in its early days. The neural network research like ChatGPT and all that stuff, those are early days. Then the analog computing that we're doing is also very, very early. It's our first-ever generation today. Definitely, the first time you take some technology to market, you learn like, "Oh, every single decision I made, I know how to do that one step better now." And so we're really excited about our second gen. We have an exciting roadmap after that as well.

Erik: Well, that was going to be my last question, which is, basically, what's exciting to you about the future? You have a second gen coming. I don't know if there's anything else, but what would you like to share about the future of Mythic?

Dave: Yeah, our first gen, we have today. On the second gen, we are immediately able to do an order-of-magnitude leap in terms of energy efficiency and performance. And so that's why we're able to get into that consumer space and really bring powerful neural networks to consumer devices. Say, running 4K video at 30 frames per second for your consumer camera system is not something that you could conceive of today.

After that, LLMs are making a big splash today. In that server environment, we talked about the half-million-dollar pair of servers. Huge amount of power, huge amount of material cost. What if you could take that and bring it down to something, say, like $1,000 that you could put into something the size of a toaster and run on, say, 100 watts? That would be a big impact for businesses in the industrial space. I don't know if that's something that would impact the consumer space due to the costs. But maybe if Apple makes it, we could sell that. But I think that's where we want to head next. It's trying to shrink the giant server down to something that you could conceivably put on your kitchen counter.

Erik: I mean, even just getting this into a shipping crate that somebody can put on their corporate campus. Because I talked to so many people in industrial about using LLMs. And just putting data onto a cloud even if it's Azure, for a lot of factories, it's just a no-go, right? And so they just want on-premise for a lot of these applications.

Dave: You don't want your data being disseminated. But also, you need to worry about uptime, right? If your internet connection goes down, you don't want your factory to stop. And so being able to have everything locally allows you greater reliability, better latency, and scalability. And so I definitely see LLMs moving to the edge. Cloud is great for new technologies and deploying them at scale. But ultimately, real life happens at the edge. And so that's where we think we can make a big impact. It's bringing those products to where they're used.

Erik: Awesome. Well, Dave, thanks so much for taking the time to talk us through this today. I will definitely reach out to you in two years, and would love to have an update once you’re launching the next generation. I really appreciate it. Thank you.

Dave: For sure. Glad to be here. Thanks for having me.
