Finding rare events in software applications is one of the principal reasons artificial intelligence succeeds in increasingly complex environments, says a DevOps troubleshooting automation expert.

“It’s telling you this cluster of events is both unusual and unlikely to be random,” said Ajay Singh (pictured left), founder and chief executive officer of Zebrium Inc., a machine learning analytics provider recently acquired by ScienceLogic Inc.

Singh and Michael Nappi (pictured right), chief product and engineering officer at ScienceLogic, spoke with theCUBE hosts John Furrier and Savannah Peterson at AWS re:Invent, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed advances in the processes for finding root causes of software problems. (* Disclosure below.)

Scaled misunderstandings

The problem with traditional fault-finding is that humans can’t scale as quickly as data can, according to Singh. That’s because modern cloud applications, with their plethora of microservices, containers and so on, are creating ever more complex environments. That’s all exacerbated by the increasing speed at which changes get rolled out. “Software breaks,” he said.

“People develop new features within hours, push them out to production. The human has just no ability or time to understand what’s normal. You need a machine,” Singh explained.

“You can’t manage what you don’t know about,” added Nappi. “Visibility, discoverability, understanding what’s going on in a lot of ways, that’s the really hard problem to solve.” That’s where AI comes in, and Zebrium has its own specialized approach to things.

“At its heart it’s classifying the event catalog of any application stack,” Singh explained. “Figuring out what’s rare, when things start to break, it’s telling you this cluster of events is both unusual and unlikely to be random.” Such a cluster points to the root cause of the problem.
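The idea Singh describes (classify log events into types, work out which types are rare, then look for rare events that cluster together) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Zebrium’s actual method: the function names, the regex-based event typing, the smoothed rarity score and the `threshold`, `max_gap` and `min_rare` parameters are all hypothetical, and "several rare events close together" stands in as a crude proxy for "unlikely to be random."

```python
import math
import re
from collections import Counter


def event_type(line):
    """Reduce a log line to a crude event type by masking numbers and hex ids."""
    return re.sub(r"\b(0x[0-9a-f]+|\d+)\b", "<n>", line.lower()).strip()


def make_rarity_scorer(baseline_lines):
    """Build a scoring function from logs captured during normal operation."""
    counts = Counter(event_type(line) for line in baseline_lines)
    total = sum(counts.values())
    vocab = len(counts)

    def score(line):
        # Add-one smoothing: event types never seen in the baseline
        # receive the highest rarity score.
        seen = counts.get(event_type(line), 0)
        return -math.log((seen + 1) / (total + vocab + 1))

    return score


def rare_clusters(lines, score, threshold, max_gap=3, min_rare=2):
    """Return (first, last) line indices of groups of rare events.

    Rare events belong to the same group when they are at most
    max_gap lines apart; groups smaller than min_rare are dropped.
    """
    rare = [i for i, line in enumerate(lines) if score(line) >= threshold]
    groups, current = [], []
    for i in rare:
        if current and i - current[-1] > max_gap:
            groups.append(current)
            current = []
        current.append(i)
    if current:
        groups.append(current)
    return [(g[0], g[-1]) for g in groups if len(g) >= min_rare]


# Hypothetical data: a baseline of routine events, then an incident window.
baseline = [f"request {i} ok" for i in range(100)] + ["cache miss for key 7"]
incident = [
    "request 5 ok",
    "disk full on /var",
    "write failed errno 28",
    "request 6 ok",
    "request 7 ok",
    "cache miss for key 9",
    "request 8 ok",
]
score = make_rarity_scorer(baseline)
clusters = rare_clusters(incident, score, threshold=4.0)
print(clusters)  # [(1, 2)]
```

In this toy run the two never-before-seen events sit next to each other and are reported as one cluster, while the merely infrequent cache miss stays below the threshold. A production system would need far more robust event typing and a real statistical test rather than a fixed gap and count.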

The process of identifying issues with more accuracy has changed as services have become more prevalent in information technology. “You can’t hire enough engineers to scale that kind of complexity. They use machine learning to tremendous effect to rapidly understand the root cause of an application failure,” Nappi said of Zebrium’s AI approach.

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of AWS re:Invent:

(* Disclosure: ScienceLogic Inc. sponsored this segment of theCUBE. Neither ScienceLogic nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
