On November 26, in Menlo Park, we hosted a session titled “The Right and Cool Way of Building Robots” as part of the event “The Future of AI in Robotics and Wearable Hardware.” Zhaoyang Wang, editor-in-chief of GenAI Assembling, introduced Pawel Budzianowski, co-founder and CTO of K-Scale Labs, to discuss innovative approaches to building robots in a startup.
Pawel shared his journey from NLP to robotics and K-Scale’s progress in developing open-source platforms, managing supply chains, and advancing learning in simulation. He highlighted AI’s role in transforming robotics, the value of low-cost production, and the importance of real-world data for training smarter robots.
Below is the full transcript of their conversation:
Zhaoyang: Welcome to our first meetup! Today, we have a fantastic guest, Pawel from the robotics startup K-Scale. He’s the CTO and co-founder of this incredible humanoid robotics company. Over the next 30–40 minutes, we’ll discuss the best and most innovative ways to build robots in a startup, covering all aspects of this fascinating topic. But before I dive into my questions for Pawel, let me ask the audience: have any of you heard of the Kardashev scale?
Audience Member: It’s about civilizations and energy consumption—measuring development by how much energy a civilization can harness, right?
Pawel: That’s absolutely correct! At K-Scale, we strongly believe robots are essential for maintaining and advancing our civilization. To reach the next level of progress, we’ll need significantly more energy—because as we deploy more robots, we’ll eventually harness energy from all the stars in the galaxy. That might sound like science fiction now, but I hope it will become a reality one day.
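[Editor’s note: the scale is usually made quantitative via Carl Sagan’s interpolation, K = (log₁₀ P − 6) / 10, where P is a civilization’s total power use in watts. Humanity today sits at roughly K ≈ 0.7; a Type II civilization (K = 2) harnesses the full output of its star, and a Type III civilization (K = 3) that of its galaxy.]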
Zhaoyang: Exactly. I brought up this question because the company name, K-Scale, is inspired by this theory. Let’s start there. Why did you choose this name, and how does K-Scale work toward fulfilling that vision? What are you working on now?
Pawel: Sure, a bit of background first. Nice to meet you all! I’m Pawel, and we started K-Scale just a few months ago. Ben, our CEO, joined Y Combinator’s Winter 2024 batch. My background is in large language models (LLMs) and conversational AI. Ben and I first connected over the internet, as many people in the LLM and robotics communities do, and we met in person a couple of weeks before I traveled to Europe, at a breakfast that somehow turned into a four-hour conversation.
Around that time, some fascinating research was coming out of Stanford, Berkeley, and DeepMind—like the papers showing small robots playing soccer autonomously. When I saw that, it reignited my passion. I told my wife, “I need to build one of these robots!” So here I am, nine months later.
As Ben and I talked, we realized the technology landscape for AI and robotics was evolving quickly, especially in areas like actuators. It led to what seemed like a crazy bet: we genuinely believe that, within a few years, personal humanoid robots could cost less than $5,000. It might sound absurd in 2024, but consider how unexpected ChatGPT’s breakthrough felt just a couple of years ago. If this happens, exploring the galaxy with robots—teleoperating from Earth while sipping coffee—doesn’t seem so far-fetched. That’s how we landed on the name, inspired by the Kardashev scale. Though I’ll admit, not everyone catches the reference!
Zhaoyang: Fascinating. So, what has K-Scale been up to recently? I remember meeting Ben three months ago, and he mentioned some exciting developments. What progress have you made in these few months?
Pawel: We officially launched the company in March. Let me show you something: what you see here is a $350 robot, entirely open-source. This project began just 30 days ago as part of a hackathon we hosted. Over the past month, we’ve supported developers using our software stack and expertise to build robots like this one. Our goal is to create a foundational layer of software and machine learning tools to make robotics accessible to everyone.
The difference we aim for is affordability and functionality. Imagine small LLMs running on these robots, enabling them to perform genuinely useful tasks. Right now, this particular robot is still fairly basic—it can do push-ups—but it’s a starting point. Our mission is to empower developers to build the future of robotics, one accessible tool at a time.
Zhaoyang: So it seems that you’re building your own robots to sell, but at the same time, you’re creating a platform for developers. You’re offering everything they need, and as much as possible, you’re making the tools open-source. Plus, you have a large community and host hackathons. That’s a lot of work! Why take on so much, and is the funding you’ve raised enough to support all of this?
Pawel: What I can say is this: over the past few years, we’ve seen enormous amounts of money wasted on autonomous driving. In comparison, we may look like we have minimal funding—how can you compete with Tesla, right? But I think there’s a path forward. We are laser-focused on market demand. Our motto is simple: don’t die as a robotics company, because 99% of robotics companies fail.
Our goal is to create robots that are affordable enough for most people to buy and then build upon. They don’t need to be as flashy as Tesla Optimus or as feature-rich as Unitree robots. They just need to be good enough for others to innovate on top of them. With this framing, I believe even a small team, armed with a few million in seed funding instead of hundreds of millions, can achieve great things. We’re working with the right manufacturing partners in East Asia to make this vision achievable while opening up significant opportunities.
Zhaoyang: So what you’re working on is ambitious, but it’s achievable, right?
Pawel: Absolutely. We’re currently waiting for some larger robots to arrive at our Menlo Park office—though don’t tell our neighbors; they might freak out!
Zhaoyang: Cool! Let’s dive deeper into your products and software. Starting with the robots: how many are you building now at K-Scale, and what skills do they have? You’ve mentioned you’re not trying to replicate something like Optimus or Figure. How do you decide which skills to train your robots in?
Pawel: Long-term, our vision is to be the go-to open-source platform for anyone starting a robotics project. Everything we do is open-source. For example, we even hold our scrum meetings online, so anyone can watch or join and ask questions about their own projects. To get there, we’re building a robust tech stack.
Our first priority is creating a robot operating system that doesn’t rely on ROS. Instead, we use Rust because it’s much simpler and faster. This system also needs a single, deployable neural network that’s fast enough to adapt to any problem. That’s our North Star, and it drives us every day.
From a software perspective, we’re building this foundation 24/7. On top of that, we’re focused on a unified model approach. While traditional robotics relies heavily on model-predictive control (MPC), we’re diving into reinforcement learning and imitation learning for manipulation tasks. This neural network will power our robots, enabling developers to create innovative applications on top of it.
On the hardware side, we’re starting with a mid-sized humanoid robot. It’s designed to be affordable, about 10x cheaper than what’s currently available. It will stand around 160 cm (about five feet three inches) and have essential capabilities like walking, talking, and basic manipulation.
We’re also focusing on usability. For instance, our robots will support a web interface to stream video data and enable teleoperation. The neural networks behind them will eventually be able to perform practical tasks, although this is still a work in progress. For now, some can even do push-ups!
Our goal with this basic platform is to test the market and understand how many people are willing to buy it and innovate further. Looking ahead, we’re exploring two other key markets. The first is industrial robotics—building large humanoids for labor-intensive tasks, like those from Unitree or Figure. However, manufacturing costs for this segment are still too high. The second is education. For instance, students in Guatemala are using our tools to build small humanoids that cost just $300. There’s significant untapped potential in educational robotics.
So for now, we’re targeting medium-sized humanoids as a practical starting point.
Zhaoyang: It sounds like you’re building a kind of Lego system for humanoid robots. But you can’t meet everyone’s needs, right? I’ve seen your videos online—your approach seems to involve imitation learning for manipulation and AI-driven locomotion. I’ve talked to other robotics experts, and there seem to be two mainstream approaches. One is end-to-end, like what Elon Musk advocates as the future. The other is a compositional approach that combines different technologies to create a more feasible solution. What’s your perspective?
Pawel: A year ago, when I started exploring robotics literature and attending conferences, it felt like revisiting the NLP world of 2016. Back then, I was doing my PhD in conversational AI and training models on a single GPU with datasets of about 10,000 dialogues. Fast forward to today—companies like OpenAI and Anthropic process that many dialogues daily for minor fine-tuning tasks.
When I looked at robotics, I saw a similar situation. Researchers are still training models on a single GPU for their PhDs, primarily because you can’t run large language models in the cloud for robotics. Physical robots need to operate at 50 Hz, meaning they need to process and act on data every 20 milliseconds. This makes robotics inherently harder—you not only need smart AI but also hardware-optimized, ultra-reliable systems. That’s why so few people venture into this space.
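[Editor’s note: to make that 20-millisecond budget concrete, here is a minimal sketch of a fixed-rate control loop in Rust, the language Pawel mentions K-Scale building on. The sensor, policy, and actuator functions are hypothetical placeholders, not K-Scale’s actual API.]

```rust
use std::time::{Duration, Instant};

// Hypothetical types standing in for real sensor and motor interfaces.
struct Observation { joint_angles: Vec<f32> }
struct Action { torques: Vec<f32> }

fn read_sensors() -> Observation { Observation { joint_angles: vec![0.0; 16] } }
fn policy_forward(obs: &Observation) -> Action { Action { torques: obs.joint_angles.clone() } }
fn apply_torques(_action: &Action) { /* send commands to the actuators */ }

fn main() {
    let period = Duration::from_millis(20); // 50 Hz control rate
    loop {
        let tick_start = Instant::now();

        let obs = read_sensors();          // joint angles, IMU, cameras, ...
        let action = policy_forward(&obs); // one neural-network inference
        apply_torques(&action);            // command the motors

        // Sleep out whatever remains of the 20 ms budget. If sensing,
        // inference, and actuation together overrun it, the robot misses
        // a control tick -- which is why a cloud round-trip is a non-starter.
        if let Some(remaining) = period.checked_sub(tick_start.elapsed()) {
            std::thread::sleep(remaining);
        }
    }
}
```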
Historically, robotics has relied on mathematical models to achieve tasks like walking. Recent advances have allowed simulations and game engines to accelerate learning. For example, in a simulation, you can reward a robot for walking in a straight line for 20 seconds. This has become a standard benchmark. However, tasks like manipulation—picking up a bottle and opening it—are far more complex because they require precise torque and friction control, which are difficult to simulate accurately.
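[Editor’s note: a toy illustration of the reward shaping Pawel describes; summed over 20 simulated seconds (1,000 steps at 50 Hz), a signal like this is what the walking policy is trained to maximize. The state fields and weights below are invented for the example; real locomotion rewards typically add many more terms, such as energy and joint-limit penalties.]

```rust
// Hypothetical simulator state for a "walk in a straight line" task.
struct SimState {
    forward_velocity: f32, // m/s along the target heading
    lateral_offset: f32,   // meters of drift off the line
    torso_height: f32,     // meters; a low torso usually means a fall
}

// Per-step reward: pay for forward progress, charge for sideways
// drift, and penalize falling over.
fn step_reward(s: &SimState) -> f32 {
    let progress = s.forward_velocity;
    let drift_penalty = 0.5 * s.lateral_offset.abs();
    let alive_bonus = if s.torso_height > 0.5 { 0.1 } else { -1.0 };
    progress - drift_penalty + alive_bonus
}
```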
Zhaoyang: You’ve already shared some of your background, and I found it fascinating when I first heard your speech last month. You’re an NLP expert, an algorithms guy, transitioning from the large language model field to becoming the CTO of a hardware company—a robotics company. Robots are a hot topic these days, largely because AI advancements are significantly influencing progress in robotics. Could you walk us through how large language models (LLMs) are shaping—or as I like to say, invading—the robotics field?
Pawel: Sure. The beauty of today’s machine learning world is that progress in one domain is easily transferable to others. A year ago, LLMs were the primary focus. Now, if you’re 22 years old and entering the machine learning field, your first exposure is likely with vision-language models—tools that let you input an image and discuss it. It’s no longer just about text.
Fast forward, and we now have open-source models that handle speech input and output. In just a few months, we’ll see truly advanced video models. Transitioning from LLMs to vision-language models and beyond isn’t a huge leap; each new data type—audio, video, or motor feedback—becomes just another dimension for the neural network.
In robotics, the only additional dimension is the state information from the robot—details about the motors or joints. Feeding this data into the network makes the training process nearly identical to what we do with LLMs. For me, adapting to the ML side of robotics was just about understanding how to handle this extra input.
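[Editor’s note: a schematic of the point being made. The structures below are invented for illustration; the idea is just that proprioceptive state is flattened and appended to the other modalities before the network’s forward pass.]

```rust
// Hypothetical proprioceptive state: what the robot knows about itself.
struct Proprioception {
    joint_positions: Vec<f32>,
    joint_velocities: Vec<f32>,
}

// Concatenate vision features and robot state into one input vector.
// To the network, motor readings are just more dimensions, handled
// exactly like image or audio features in a multimodal model.
fn build_model_input(vision_features: &[f32], state: &Proprioception) -> Vec<f32> {
    let mut input = Vec::with_capacity(
        vision_features.len()
            + state.joint_positions.len()
            + state.joint_velocities.len(),
    );
    input.extend_from_slice(vision_features);
    input.extend_from_slice(&state.joint_positions);
    input.extend_from_slice(&state.joint_velocities);
    input
}
```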
However, the hardware side has been a completely different challenge. It’s been nine months, and I’m still learning the intricacies. Everyone warned me, “Pawel, don’t go into hardware—it’s tough.” At the time, I thought they were overreacting. Now I understand them, but I still love robots. I’ll stick with it, even though there’s a steep learning curve.
From an ML perspective, the influence is clear. Robotics today is being shaped more by NLP and ML advances than the other way around. While this is challenging, it’s also incredibly exciting.
Zhaoyang: What you’re describing feels metaphorical—not just for your journey but for the future of robotics. In five years, I imagine many founders will have similar backgrounds, blending software and hardware expertise. When preparing for this fireside chat, I thought a lot about the relationship between software and hardware. For example, Bitcoin is software, yet it can push hardware to its limits. Similarly, smart software can make hardware look inadequate—why pair brilliant software with a “dumb” robot?
Do you think this dynamic could shift in the future? Could hardware start influencing or even enhancing software? Especially now, as people talk about scaling challenges and hitting a wall, do you see opportunities for hardware to play a more significant role?
Pawel: For a long time, I believed that adding more data—like video or audio—would naturally make models smarter. But then Ilya Sutskever proved us wrong: even with just text, we can build incredibly intelligent models.
I think the next leap will come when we have thousands of small robots performing everyday tasks—doing laundry, shopping for groceries, or walking and talking with people. If these robots gain a degree of autonomy, the possibilities are immense.
For instance, imagine telling a robot, “Buy me some groceries,” without specifying a detailed list. The robot might buy a Coke and a Diet Coke, which could annoy you—like it would annoy me—but that interaction is invaluable. The robot would learn, “Pawel doesn’t like Diet Coke because it’s sugar-free.” This kind of real-world feedback is almost impossible to replicate with annotated data created by humans.
It may sound far-fetched, and we’re still thousands of hours of collective work away from achieving it, but once we do, this kind of autonomy will generate data that drives algorithms forward in unprecedented ways. Robots won’t just complete tasks; they’ll begin to understand how their actions influence the world. That’s the next big step.
Zhaoyang: That’s fascinating. I recently read an article in The New Yorker about robotics. It mentioned an intriguing idea: the human hand isn’t just a sensor—it’s a kind of vision processor. Even in total darkness, you can identify an apple by touch. It’s similar to how vision works. Do you think this aligns with your idea that the way we collect and process data will change, and that hardware advancements will play a key role in this evolution?
Pawel: Absolutely. The form and collection of data will continue to evolve, and hardware innovations will be crucial to this progress.
Zhaoyang: Let’s shift to the real-world challenges of the robotics industry. It’s a complex field, especially when it comes to shipping products. You’ve mentioned before that shipping even one robot can cause sleepless nights. What’s your view on the industry and its supply chains? What’s the current state, and where do you see the ecosystem heading?
Pawel: Coming from Europe, I’d love to see a global ecosystem where regions like South Asia, Africa, South America, North America, and Europe collaborate seamlessly. Unfortunately, the current dynamics suggest otherwise.
At K-Scale, we design everything in-house but rely heavily on manufacturers in East Asia. In Shenzhen, for instance, you see unmatched speed, efficiency, and quality, at a cost that North America simply can’t compete with. That’s why we’re betting on partnerships with manufacturers in this region.
Our open-source approach reflects this belief. We want our designs to be built by as many companies as possible. Our focus is on software—building smarter platforms for robots—but this vision can’t be realized if developers and researchers struggle to access hardware. I’ve heard from people at Berkeley and Stanford who can’t get enough hardware, which is ridiculous. How can we progress when there are only about 100 operational humanoids globally?
The only way forward is large-scale, cost-effective manufacturing. History has shown this approach works, and we’re confident it will again.
Zhaoyang: We’re running short on time, so let me ask one last question before opening it up to the audience. I heard your team rents a house in Palo Alto, where you live and work together. It’s just a three-minute drive from Marc and Jason. How’s that working out?
Pawel: (Laughs) I can confirm that a16z is not our investor—yet! But yes, we do have a robotics house. It’s a space where we live, work, and collaborate. We even have a PhD student in residence.
Our house is open to developers interested in working on robots or building their own projects. We also host hackathons and will announce the next one soon, once we receive a larger platform shipment from China. If you visit, you might even bump into Marc on a walk!
Zhaoyang: Thank you, Pawel. This has been a fantastic conversation.