QE AI Code Quality in DevOps with Itamar Friedman • Test Guild

About this DevOps Toolchain Episode:

Welcome to another exciting episode of the DevOps Toolchain Podcast! Today, we’re diving deep into AI Code Quality in DevOps with our special guest, Itamar Friedman, CEO and co-founder of Qodo, formerly Codium AI.

Itamar shares his insights on how automation—specifically AI—is poised to exceed expectations, drawing parallels with historic technological revolutions like electricity and transistors.

We’ll explore the evolving role of developers as AI takes on more tasks, emphasizing that while AI can increase productivity and skill set, human expertise in complex problem-solving and code verification remains irreplaceable. Itamar introduces us to Qodo’s innovative tools that significantly enhance test suite coverage and code quality and discusses the importance of developing AI as a core skill, much like learning a new programming language.

We’ll also tackle the current and future landscape of end-to-end testing, the concept of “flow engineering,” and the complex balance between code generation and quality assurance. Itamar offers a glimpse into the future of technical product management with AI-driven feature generation and the potential emergence of new roles such as “AI Guardrail Engineers.”

Prepare for an enlightening discussion on the challenges, opportunities, and advancements in AI-driven software development. Let’s get started!


About Itamar Friedman

Itamar Friedman is the CEO and Co-Founder of CodiumAI. Before founding CodiumAI, Itamar was a founder of Visualead, which was acquired by Alibaba Group. He then worked for Alibaba Group for 4 years as the Director of Machine Vision.

Connect with Itamar Friedman

Company: codium.ai
LinkedIn: www.itamarf

Rate and Review TestGuild DevOps Toolchain Podcast

Thanks again for listening to the show. If it has helped you in any way, shape or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.

Transcript


[00:00:01] Get ready to discover some of the most actionable DevOps techniques and tooling, including performance and reliability, from some of the world’s smartest engineers. Hey, I’m Joe Colantonio, host of the DevOps Toolchain Podcast, and my goal is to help you create DevOps toolchain awesomeness.

[00:00:19] Hey, AI is all the rage, but how can AI help you with code quality? That’s what we’ll be talking all about today with Itamar, who is the CEO and co-founder of Qodo, previously CodiumAI. Before founding Qodo, he was the founder of Visualead, which was acquired by Alibaba Group, which probably everyone knows about. He worked for Alibaba Group for four years as a director of Machine Vision, which I’m really excited to dive into as well to see what that’s all about, and I’m really excited to hear more about this new technology. It can help DevOps engineers, testers, and everyone with code quality, which I think is a good use of AI. You don’t want to miss this episode. Check it out.

[00:00:57] Joe Colantonio Hey. Welcome to the Guild.

[00:01:01] Itamar Friedman Yeah, thank you so much for having me here.

[00:01:04] Joe Colantonio Great to have you. I guess before we get into it, is there anything in your bio that I missed that you want The Guild to know more about?

[00:01:09] Itamar Friedman Yeah, I think that was good enough. To add to what you said, throughout my career I worked a lot on quality: I worked on chip verification at Mellanox, and also on system verification, etc. And when you’re doing machine learning (I have a master’s degree in machine learning), you’re doing a lot with benchmarks. So it’s not a surprise that eventually, when we came into disrupting the software development world, we cared a lot about quality, testing, and reviewing. That’s also part of the reason why we’re renaming the company from CodiumAI, although I love that name, to Qodo with a Q: to emphasize the quality of development. That’s what’s behind the Qodo name.

[00:01:53] Joe Colantonio Nice. Back in the day when I started, quality was relegated to a group of testers. And I’m not saying developers didn’t think about quality, but it wasn’t the top of the list, I don’t think, because they had a group that handled it. Obviously, there’s now a shift where developers are more and more responsible for quality. Is that why you see more of a need for tooling to help them not only create excellent code but also make sure it’s of high quality?

[00:02:14] Itamar Friedman Yeah, that’s one reason, but there’s another: AI is already writing more and more of our code, and in the future the level of automation of code writing by AI is going to increase. There could be lines of code, maybe many of them, that a human would not review. We have to have the counterpart. We don’t want to have those outages where half of the world can’t work because of a blue screen or whatever. I’m referring to one famous outage from the last few months. In order to enable that level of automation, of AI writing more and more code, we have to have the counterpart that is focused on quality. You can ask yourself, by the way: if the same AI is writing code, why shouldn’t it also check its own code? I’ll keep it short to start with. First, the techniques are different. In both cases an LLM may be one of the core components, both in code generation and in code quality (by the way, sometimes we call this field code integrity; I can explain). So they could share some building blocks like an LLM, but the overall system looks very different, and the UI looks different. And also, you don’t want to let the cat guard its own milk, right?

[00:03:33] Joe Colantonio Yeah. That was my follow-up. How can you trust AI-written code being tested by AI? Where does the human come in at this point?

[00:03:40] Itamar Friedman Yeah. So first, to answer your first part: it’s like asking how you could trust code written by a human if it’s just tested by another human. Don’t you think there is some equivalence? Both of them are humans. The thing is that they use very different tools. In a platform like Qodo, we have tools that integrate mutation testing, and we have static analysis that is focused on understanding the behavior of the actual usage. We are focused on analyzing the best practices of the organization and then making sure that they are being applied. The techniques, the user interface, and the methods are different. By the way, I’m going to share some news; unfortunately, it’s only a few days from now that we’re going to share the actual numbers. We took, for example, o1, one of the new models by OpenAI. They released the numbers they got when they competed with o1 on a competition called Codeforces. Supposedly o1 thinks by itself, so you don’t need to put it inside an agent; it can do all the steps of thinking about the problem by itself. It so happens that we developed an additional agent (not additional, sorry, an independent agent) called AlphaCodium. We call it that to give credit to AlphaCode by DeepMind. With AlphaCodium, you can take any model and also compete in that programming competition, but this time also applying testing techniques, code review techniques, and spec matching techniques. Surprisingly or not, when you put o1 inside AlphaCodium, AlphaCodium improved the results of o1 by 50%. Why? Because it exploits techniques that o1 cannot, around testing and code review. And then the overall outcome of the AlphaCodium agent, using a model like o1 that can supposedly think, is much better.
So why should we trust an AI to check another AI? Because they’re not the exact same entity, similar to how a QA engineer is not the exact same entity, and is not using the same tools, as a developer.

[00:05:54] Joe Colantonio All right. How does that work? Could you break it down for me? Is it that you don’t prompt it? It just knows that this code does this, so therefore it will want to test these inputs, and it’s going to crawl it automatically and be able to determine states that it should and should not be in?

[00:06:10] Itamar Friedman Actually, great question. We released a paper on this, and by the way, what I’m sharing is open source. It was well received by many; even Karpathy tweeted about it, etc. Basically, we call it moving from prompt engineering to flow engineering. So your question is spot on. It’s not just about using the right prompt. Actually, it’s almost not about that at all. It’s about developing a flow where, as part of the code generation, we don’t only use an LLM to generate code; we also use an LLM to generate tests. We also use other techniques that are not related to AI to run the tests, debug the code, and verify that the actual development meets the specification. That’s why we call it flow engineering: think about it more as an algorithm that runs step by step, with different LLM calls, different prompting, and different tool usage, until it generates code that was actually reviewed in different ways. And o1 by itself cannot do that. It doesn’t have the framework to do that, and it doesn’t have the tools connected to it. Those are the two things that are important: the framework and the tools.
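
Editor's sketch: the flow described above (generate code, generate tests, run them with ordinary tooling, feed failures back) can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not AlphaCodium's actual implementation. The functions `generate_code` and `generate_tests` are hypothetical stand-ins for LLM calls and return canned values so the example runs on its own.

```python
from typing import List, Optional, Tuple

BUGGY = "def add(a, b):\n    return a - b\n"
FIXED = "def add(a, b):\n    return a + b\n"


def generate_code(spec: str, feedback: List[str]) -> str:
    """Hypothetical LLM call: first draft is buggy, revision uses feedback."""
    return FIXED if feedback else BUGGY


def generate_tests(spec: str) -> List[Tuple[tuple, int]]:
    """Hypothetical LLM call: returns (arguments, expected result) pairs."""
    return [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]


def run_tests(source: str, tests: List[Tuple[tuple, int]]) -> List[str]:
    """Non-AI step: execute the candidate and collect failure messages."""
    namespace: dict = {}
    exec(source, namespace)  # acceptable for a toy; sandbox real model output
    failures = []
    for args, expected in tests:
        actual = namespace["add"](*args)
        if actual != expected:
            failures.append(f"add{args} returned {actual}, expected {expected}")
    return failures


def flow(spec: str, max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Iterate generate -> test -> feed back until the tests pass."""
    tests = generate_tests(spec)
    feedback: List[str] = []
    for round_number in range(1, max_rounds + 1):
        source = generate_code(spec, feedback)
        feedback = run_tests(source, tests)
        if not feedback:
            return source, round_number
    return None, max_rounds


if __name__ == "__main__":
    source, rounds = flow("add two integers")
    print(f"accepted after {rounds} round(s):\n{source}")
```

The point of the sketch is the shape of the loop: the model is one component among several, and the test runner decides when the flow stops.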

[00:07:26] Joe Colantonio All right. Once again, how does this work then? How do you envision this working in a modern software development company? Do they just say, here’s what we want, and hand it off to an AI that then codes and creates it, and then calls an agent that tests it and lets you know whether or not to ship it into production? What does that flow look like?

[00:07:47] Itamar Friedman Yeah, great. I’ll start with five years from now and derive it back.

[00:07:53] Joe Colantonio All right. Awesome. All right, cool. Even better.

[00:07:56] Itamar Friedman Think about five years from now. I think many of us believe that there’s going to be some technical product manager writing some feature. Notice I didn’t mention a developer, but I did mention technical. Let’s imagine that I am a technical product manager at one of the biggest e-commerce companies, and now I’m dictating a feature to my AI: hey, every time one of my users adds a second item to the cart that somehow relates to the first item, suggest the color that will fit the first one. Then the AI generates five different flows for that, including the actual code underneath. The technical product manager chooses one of them according to different criteria (I’m not going to touch on that), and then clicks push to production. That’s the part where we need to stop for a second. There has to be an additional dashboard that shows: here are all the tests that were run to verify this code; here are all the tests that were run on other relevant flows to verify that nothing was broken; here is what the platform checked about the maintainability of the code, according to the company’s best practices and even according to the company’s values (let’s not touch that for a second). And by the way, we noticed one piece of code that we’re not sure about, and you should send it to your dev. So how I imagine it is that there’s going to be an intelligent coding system, an intelligent coding solution, that is in charge of writing the best code it can. It will probably work quite fast. I’m relating here to Thinking, Fast and Slow. On the generation part, you probably work as fast as you can, but once you understand that this is probably what you mean, it’s time to think slow: how do I want to refactor, verify, test, etc., before I push to production? And that’s a different technique and a different platform. Now, deriving back to today and what it takes to get there.
In real-world enterprise software, there are so many things that need to be attached, checked, and mocked. It will take time to get to that future. And I think that this future will be achieved by many, many agents, each one of them doing something relatively compartmentalized, really specific. This one is very good at verifying that you’re not breaking databases. This one is very good at unit testing. This one is good at UX testing, like end-to-end, etc. Each one of these agents needs to run once you’ve generated some piece of code, and give feedback back, in order to give a green light or point to something that needs to be improved. Right now at Qodo, and at other companies as well, each one is building a different building block. Eventually, someone is going to put their hands on all of these building blocks, and then we’ll be able to enable the next generation of AI developer in the enterprise, which doesn’t exist today.

[00:11:00] Joe Colantonio All right. Now, I’ve worked for health care companies and insurance companies, completely different domains, but you needed to understand all the regulations and rules within each domain. An AI, I think, would be generally trained. How would it know your system or your domain? Eventually, would there be an LLM for health care and an LLM for insurance? I mean, obviously you can have one that writes code and tests it, but it has to know the context of your company or your software or your business domain, if that makes sense.

[00:11:29] Itamar Friedman Amazing question. Makes a lot of sense. That’s actually very related to my point. Why do I claim that there are going to be many agents on the integrity part? Because I think that each agent will have a set of guardrails and data related to its task. One would be well engineered (that’s why we call it flow engineering in the paper) to deal with the compliance, security, and best practices you expect in health care software, and specifically it would also know how to attach itself to the specific code bases of a specific organization. And it will verify that the code generation part of the system actually adheres to these. I think creating each of these specific agents requires a lot of expertise and time invested. And yes, there would be different guardrails and guidelines for markets like health, security, or education. Again, I’m talking about a few years from now. I think we’re still building those building blocks and capabilities.

[00:12:39] Joe Colantonio Do you see a role like an AI Guardrail Engineer type deal as a possibility?

[00:12:45] Itamar Friedman Let’s say, you know, it’s like a sub-role, like you have front end, back end, etc.

[00:12:51] Joe Colantonio Yeah.

[00:12:52] Itamar Friedman I definitely see it, like you have security experts, etc. I definitely see that type of sub-role.

[00:12:59] Joe Colantonio All right. I mean, if someone’s listening to this and they’re a developer or tester, they might be like, this is just disruption at its highest. And I think it’s going to happen. A lot of people think it’s hype. How do you know what’s hype and what’s real, though? Obviously, you’ve been in the AI game and machine learning since before it really started taking off in the past few years. How would you persuade someone to say, hey, look, this isn’t just hype, it actually works?

[00:13:22] Itamar Friedman Okay. So first, I want to relate to the people who think this way, because I can actually understand them. I think right now people tend to overestimate the capabilities of AI today. There is no real solution right now to create software end to end, despite all the amazing demos that you see out there. Those are usually five-page websites that you could probably build yourself by choosing some GitHub repo. Now, it’s still amazing: you’d need to go and search for the better GitHub repo and actually make modifications, etc. It might take you a few hours, and with AI it takes you 20 minutes. It’s amazing. But we’re far from actually automating the creation of real-world software that’s run by millions. The gap between these five-page websites and real software is huge, and it will take time to fully disrupt that part. I can understand people who are professionals, looking at what we have today and saying, oh my God, everybody is exaggerating. At the same time, I think they might be underestimating the power of using this technology and, slowly or quickly, building comprehensive systems around it, like the one that I described in the last 15 minutes. Once we actually integrate AI, not just as in, hey, spit out this website for me, but to apply verified best practices; once we make it help us verify 100 different types of possible issues and bugs, or groups of them (it will take time to engineer that); and once another one is in charge of verifying that we’re not ruining our databases, etc. (I’m focusing right now on the integrity part), then suddenly there could be an opportunity for a level of automation that is really, really high. And eventually it will even exceed our expectations.
I think that happened with real, let’s say, dark-magic technologies like electricity, or chips, transistors, and things like that. People originally imagined something like AI when the transistor came out. It took much more time, but eventually it came. The same thing here: eventually it will come. That’s what I have to say to those people: there are full systems being built that will be more than what you see, and they are the ones that are going to deliver the promise of automation that we’re talking about. But still, I think people will be involved, by the way. The profession is going to change, but there are going to be a lot of developers.

[00:16:05] Joe Colantonio All right. There will be a lot of developers. So how much do developers need to know about coding then, or language syntax, and how much do testers need to know about testing at that point? It almost sounds like you have to be an expert developer or tester, not a low-level one. But if you have AI doing it, how do you get to be an expert, with AI doing all the heavy lifting?

[00:16:25] Itamar Friedman Yeah, I very much agree with your second part. I’ll explain. I think that the people who are working right now are distributed like a Gaussian, right? You have quite a few juniors, but the majority is somewhere between junior and mid, and then mid to senior. And then you have the seniors and the principals. By junior I mean, to be more accurate, not the number of years of experience but how good you are at actual software development. So that’s one Gaussian. I think we’re going to see two Gaussians. There are going to be many more people, like 10x or 100x more, who can do software development because they only need to write technical English, kind of like, by the way, how 100x more people can now do design or produce design outputs. At the same time, with all due respect to AI, and relating to what I said before, it’s going to take many, many, many years until you don’t need developers to verify, to review, and to deal with the hard problems. And actually, there’s going to be so much code written by AI that there is going to be a lot of code that developers need to tackle, wrestle with, debug, and steer, working together with the AI to solve. We will be able to solve much more complex problems (I’m talking about things like flying to Mars, but also software problems), and developers with AI will be able to solve them. Then, considering this world, how do we get to a point where there are many more developers who are that professional?
That’s an open question, and I don’t think I have a good answer. But to some extent, I think AI could push people who do want to learn to actually learn faster. I can tell you that I myself am learning faster with AI, if I care about it, right? If I don’t, I could copy and paste from anywhere before, and now I can copy and paste even faster with AI. But if I care about learning, I can ask many questions that would otherwise take me a lot of time: why did you give me this code? Why could it work? Why could it not? What are the alternatives? I think that might actually help push an even bigger portion of developers to become professional.

[00:18:49] Joe Colantonio Yeah. That reminds me of the saying: developers won’t be replaced by AI, but developers with AI will replace developers without it. I don’t know if that makes sense. But it’s going to be an assistant, it almost sounds like.

[00:18:59] Itamar Friedman Yeah, and the same thing for testers, by the way. I was relating to developers, but I believe the same thing I said applies, almost copy-paste, to testers.

[00:19:07] Joe Colantonio Yeah. So what can AI not do? I’m thinking of something I read about when I started university, I think it was the halting problem, where no matter what computing power you have, you can never get over that hurdle. There has to be something that AI is not going to be able to do. Or is that undetermined, because we don’t know what we don’t know?

[00:19:28] Itamar Friedman Yeah, I’m not necessarily going to give the answer that you expect, a mathematical one. Despite promising websites or chatbots that are actually replacing people, replacing professionals (people are talking to bots more and more; I don’t want to mention names, but I think you know what I’m talking about), I still think that even 18 months or 36 months or even 5 years from now, despite the fact that AI can talk like a human, it won’t actually replace human interaction. I know that’s not what you meant, but I think that AI won’t replace the interaction between the product manager, the business owner, and the developer. A lot of creating the intent of what we want to do is going to be augmented by AI. But I still think the decision of what we actually want to do is going to be made by humans. Again, I’m not saying that AI is not going to be involved. We’re going to brainstorm with the AI, and the AI will give us recommendations and even statistics, like, hey, this has a 51% chance to succeed because of this and that, when we want to make a decision. But still, I think the bigger part of decision-making and communication, which includes brainstorming and creativity, is going to happen between people. Developers, like we said before, are going to shift more toward being technical product managers than they were before.

[00:21:04] Joe Colantonio All right. So we touched on a lot of topics here. I don’t think we really dove into what you all actually do at Qodo, formerly Codium, though we touched on it a little bit. You mentioned mutation testing. So what do you all do? How does it help? I know you touched on it, but maybe a little bit more?

[00:21:20] Itamar Friedman Yeah, sure.

[00:21:21] Joe Colantonio Yeah.

[00:21:22] Itamar Friedman By the way, let me elaborate a bit more on the previous question, because what I’ll add actually fits this one well. You might have thought that things like architecture would be hard for AI. I see a lot of people talking about that: AI will have a hard time with software architecture and things like that. In my opinion, one day that will come as well. AI can learn all of AWS and maybe can even read logs and do other things that are hard for us. But I still think that the eventual architecture decision will be made by a human; just the suggestions will be made by AI. There is no tool like that today. And that leads me to your question. For example, in order for AI to suggest architecture decisions, which copilots can’t really do and LLMs cannot really do right now, you would need to integrate other techniques into that kind of solution. I’m not only talking about the knowledge it would be trained on, but actual thinking processes and thinking frameworks. Now, what we do at Qodo. It’s the first time, by the way, that I’m mentioning this name on a podcast. I heard you saying Qodo, and it makes sense, like Q-O-D-O.

[00:22:34] Joe Colantonio Qodo, Qodo.

[00:22:34] Itamar Friedman Yeah, yeah. You write it the same, Qodo. Qodo is our solution. You’re probably the first person to hear this; we’re announcing it in a few days. The thing is that we have different agents, right? We have, for example, what we call the cover agent. This agent is in charge of taking an existing test suite, even a single test, and increasing the test suite coverage, for example from 10% to 80%. Right now it does only regression tests, but that’s a huge problem: every time you want to add a new feature, you’re afraid you’re breaking something. One of the ways to reduce that is creating an immense regression test suite, reaching 80% or 90% coverage. Then you know that if you’re developing a new feature, you shouldn’t break the regression tests. What it does is take the 1 or 2 tests you have in a test suite and expand them to 10 or 20; that really depends, because it has a criterion to stop at a certain code coverage. Now, code coverage is the criterion where it stops, but many would consider it a proxy metric, even a vanity metric. I have one friend who is a VP of R&D. His teams must reach 80% code coverage. One day he checked one of the repos that had reached 85%. He was so proud, and then he noticed that there were no asserts. Not bad ones: no asserts at all. It means that the code coverage is almost not reliable at all. There’s some value in it, because the tests would catch a crash, but it’s not really checking logic. What we think is that you need to apply other techniques. And by the way, AI might do that many times. We’ve noticed it: you call even a very good LLM, and it might create bad or incomplete asserts for you. Now, mutation testing. It’s a specific technique: what it does is create mutants in your code, actually changing the logic.
And it checks that the test suite actually kills them. For example, if there is no assert in your test, it probably won’t kill the mutant; the test won’t fail on it. It means that despite having high code coverage, you actually have a bad test suite. This is an example of a technique that most code generation tools and LLMs would not exploit. And if you’re focusing on quality, like we do at Qodo, then you do that. Yeah.
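
Editor's sketch: the point about asserts and mutants can be shown in a few lines. This is a hand-rolled illustration, not Qodo's tooling or any mutation testing library; the function `is_adult`, its mutant, and both test suites are hypothetical. The weak suite executes every line (full coverage) but asserts nothing, so the mutant survives; the strong suite checks the boundary and kills it.

```python
from typing import Callable


def is_adult(age: int) -> bool:
    """Original code under test."""
    return age >= 18


def mutant_is_adult(age: int) -> bool:
    """Mutant: the operator >= has been changed to >."""
    return age > 18


def weak_suite(fn: Callable[[int], bool]) -> None:
    """Calls the code, so coverage is 100%, but never checks the results."""
    fn(17)
    fn(18)
    fn(30)


def strong_suite(fn: Callable[[int], bool]) -> None:
    """Asserts on the boundary value, where the mutant differs."""
    assert fn(18) is True
    assert fn(17) is False


def passes(suite: Callable[[Callable[[int], bool]], None],
           fn: Callable[[int], bool]) -> bool:
    """Return True if the suite runs against fn without a failed assert."""
    try:
        suite(fn)
        return True
    except AssertionError:
        return False


if __name__ == "__main__":
    for name, suite in [("weak", weak_suite), ("strong", strong_suite)]:
        survived = passes(suite, mutant_is_adult)
        print(f"{name} suite: mutant {'survived' if survived else 'killed'}")
```

A surviving mutant is the signal that a suite with high coverage is still not checking logic.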

[00:25:07] Joe Colantonio Awesome. Yeah. Code quality, especially with coverage metrics: I’ve seen teams gamify it, where they just put in checks that are meaningless and they get high code coverage, 80%, 90%, but it’s not doing anything. That’s awesome. Yeah. All right. We talked a lot about development using AI. How about production? Do you see, with OpenTelemetry and things like that, that you can actually use AI in production as well? Yes, the code has been tested, but once it goes into the wild, you don’t necessarily know what third-party services it’s using or whatever it’s using in the background. It’s not in your bare-metal type of environment, so there are a lot of unknowns. Do you see AI being used now to say, okay, there’s an issue; because of OpenTelemetry, I know the logs, let me fix it, without having a human involved in that process?

[00:25:56] Itamar Friedman Yeah, totally. When we started the company, we thought about what we need to do, at the philosophy level, to reach 99.99 code integrity. For us, code integrity means that the code is well tested, it matches our specification, the performance is as you expect, we can even include security in it, and it’s maintainable. That’s what we call code integrity; I mentioned I would get to it, and I’m happy to explain what I mean by it. Now, when we imagined how we can get to 99.9, we asked what sources of data such a system would need. We actually marked seven data sources, and I’m not going to talk about all of them, but definitely one of them is the running software: all the data related to the actual run of the software in production. You mentioned OpenTelemetry. It could be a very good glimpse into that. It’s not like eBPF or whatever, which gives you everything down the stack, to the function level in most cases. But as far as I know, you can output very important parts of the data that runs through your software, the events that are happening in your software, etc. I definitely think that we must have this kind of information to eventually verify integrity. I’ll give an example. The same method, even a sort method, might work or not work, might be good or bad, depending on the distribution of the data that runs through it. And usually, you know the distribution of the data that runs through that sort function only when you actually run it in production. I gave a simple example, but now extend it to more complicated cases.
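
Editor's sketch: the sort example can be made concrete by counting comparisons. Insertion sort is chosen here purely for illustration (the speaker does not name an algorithm): the same function does very little work on already-sorted input and quadratic work on reversed input, which is the kind of difference only real data distributions reveal.

```python
from typing import List, Sequence, Tuple


def insertion_sort(values: Sequence[int]) -> Tuple[List[int], int]:
    """Sort a copy of values and return it with the comparison count."""
    items = list(values)
    comparisons = 0
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if items[j] <= current:
                break  # current is already in place relative to items[j]
            items[j + 1] = items[j]  # shift the larger element right
            j -= 1
        items[j + 1] = current
    return items, comparisons


if __name__ == "__main__":
    n = 100
    _, sorted_cost = insertion_sort(range(n))
    _, reversed_cost = insertion_sort(range(n - 1, -1, -1))
    print(f"already sorted input: {sorted_cost} comparisons")
    print(f"reversed input:       {reversed_cost} comparisons")
```

With 100 elements, the sorted input needs 99 comparisons and the reversed input needs 4950, so the same code is either cheap or expensive depending entirely on what flows through it.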

[00:27:50] Joe Colantonio Love it. You mentioned security a few times. I would think this makes security even harder, because even with third-party open source libraries, you’re like, it has vulnerabilities; how do you know? Bad actors are going to almost seed the AI, and I would think that would make these types of things even harder to detect. Is that something people should be concerned about, whether AI is going to be a help? Do you see regulations saying, hey, we can’t use AI for this because we’re worried about security? Or is that something that will be worked out over the years, if it comes to that?

[00:28:18] Itamar Friedman Actually, I think it could go in two directions, to be frank. One future is that we’re going to have 5x, 10x more lines of code generated every year. It means that the risk surface actually grows: one, security; two, bugs, which, as we saw, can also be really meaningful; three, data leaks, etc. The risk surface grows as we use AI more and more. That’s one future, and that’s why it’s very important to have more and more tools. This is, roughly speaking, how we imagine the world is going, because I think people will be very biased toward generating really fast. And usually, when you think very fast, even if you’re an AI, you make mistakes. The security part and the verification part, etc., would think slower, let’s put it like that. And there’s another future where maybe, during training, we can train AI so well, including managing to catch injections that people try to put in the data the AI is about to train on, or train it so well that it recognizes when it’s about to spit out a security problem, that it will help us have fewer problems to start with. That’s the second future. In my opinion, we’re going to see a mix of both. We’re not going to see the risk surface totally explode; although lines of code and automation are going to explode, the risk surface is just going to grow linearly, let’s put it like that. But it’s still going to grow, and we will have to have those tools if we want our world to keep working, because the world runs on software.

[00:30:10] Joe Colantonio All right. So we’ve touched a lot on AI for development, and we did talk about testing, but I’m just curious for a lot of my audience. This is the DevOps show, but I also have one for automation testing. Where do you see AI with testing specifically, or the future of AI in testing, if someone listening sees themselves more as a tester?

[00:30:28] Itamar Friedman Yeah, I love the question. Thank you. And I’ll tell you why. I think many people hear the word testing and it sounds like one thing. Actually, testing is such a huge field. You have the triangle of testing, bottom up: unit testing, component testing, integration testing, system testing, end-to-end testing, etc., UI testing, even UX testing, I would claim. There are different flavors, like regression testing or smoke testing or testing for new features, etc. It’s such a huge field and, unfortunately, to check software you probably need them all. Asking about AI for testing is like asking, across this entire big field, where can we gain the most? We at Qodo think about it quite a lot, and I think each type of testing has different tricks and different techniques. We started with unit testing, and I think that’s the closest to code generation, to some extent. But I think AI could really help a lot with, for example, UX testing, or maybe we’ll call it end-to-end testing: testing, for example, good and bad flows on your website. Think about it very intuitively: you can describe what you’re doing in natural language. There are even Cucumber and other languages, Playwright, etc., that are almost English. It’s not that far-fetched to think that AI could help there. I saw a few tries. I don’t think anyone has nailed it. And again, shameless plug: be aware that we are going to release something in that field very soon. But I still think there’s potential there. Bottom line, my suggestion and my intuition and insight is that testing is one of the most complicated parts of software development, for the reasons I just mentioned.
And I think it will take time for AI to actually integrate there, because it’s such a complex world, but it’s going to happen. As a tester, look at tools, try them, be that futuristic tester. And as a manager, think about what your biggest pains around testing are and check it out. Maybe you will find something that is different from just autocomplete and can really meaningfully boost your quality and productivity.

[00:32:43] Joe Colantonio With multimodal AI, it can do visuals now. Where do you see the future of end-to-end testing then? Is it all going to be image-based, where you just click, click, click, because you don’t need to code it? It seems that way because it understands the visuals of the flow. Hope that makes sense. It knows what a button is, knows what a text box is, and you just drive it using visuals. And then you can take that and run it on a mobile device, on all these devices, because it’s just going by image-based visual cues rather than text prompts. Does that make sense? Is that what you see?

[00:33:14] Itamar Friedman Yeah. I think there’s a difference between what’s possible and what I think will actually be implemented. I will explain. I think it is possible.

[00:33:22] Joe Colantonio Even better.

[00:33:24] Itamar Friedman Yeah, exactly. I think it will be possible to create a testing agent that focuses only on your visual instructions, or maybe even explores your website or mobile app by itself, as a human would, only through the visuals. The thing is, unlike us humans, developers and maybe testers, for whom it’s hard to have the capacity to look at the code and the visuals simultaneously, there is no reason that AI won’t do that simultaneously, because there is still a lot of information that can be extracted from the code itself. By the way, that’s also useful in order to do test healing or to get some hints about the next steps, etc. And also in order to try to find ways to break your flows: maybe the hints are not going to be in the visuals, but in the code. So although it is possible, in my opinion even within the next year, to build a testing agent that is focused on the visuals alone, I believe in mixing the two modalities. At least that’s what we’re doing at Qodo. Now, the last reason why: I think for some things it’s actually much better and easier to check the visual itself, because some things are very dependent on your Chrome version or things like that, and the visual is the actual thing that humans are looking at. But at the same time, in the code there are a lot of hints and information that you can use to try to find good flows or bad flows that could make or break things.

[00:35:11] Joe Colantonio Awesome. Okay, Itamar, before we go, is there one piece of actionable advice you can give to someone to help them with their AI DevOps efforts? And what’s the best way to find or contact you, or learn more about Qodo?

[00:35:23] Itamar Friedman My advice might be a bit generic, but I would say using AI is a skill, like using a new programming language. Don’t wait for it to be perfect before you start checking it out. Explore different things. Don’t only use AI for code completion. There are so many other options to embrace or exploit AI. For example, shameless plug: if you go to Qodo AI, you will find that we have various tools that help you with quality and are not necessarily code generation. That’s my advice. Try it out, develop that skill, and become that future software developer. About reaching us: we just renamed the company from Codium AI to Qodo AI, so you can find us on social as Qodo, or on our website, which is probably the best place to land. And you can find me, Itamar Friedman. You’ll find me on X, on Twitter, as well.

[00:36:29] For links to everything of value we covered in this DevOps Toolchain Show, head on over to Testguild.com/p165. So that’s it for this episode of the DevOps Toolchain Show. I’m Joe, and my mission is to help you succeed in creating end-to-end full stack DevOps toolchain awesomeness. As always, test everything and keep the good. Cheers!

[00:36:51] Hey, thank you for tuning in. It’s incredible to connect with close to 400,000 followers across all our platforms and over 40,000 email subscribers who are at the forefront of automation, testing, and DevOps. If you haven’t yet, join our vibrant community at TestGuild.com, where you become part of our elite circle driving innovation in software testing and automation. And if you’re a tool provider or have a service looking to empower our guild with solutions that elevate skills and tackle real-world challenges, we’re excited to collaborate. Visit TestGuild.info to explore how we can create transformative experiences together. Let’s push the boundaries of what we can achieve.

[00:37:36] Oh, the Test Guild Automation Testing podcast. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold.
