
For the newest instalment in our series of interviews with leading technology specialists about their achievements in the field, we’ve welcomed Tim Panagos, CTO and Co-founder of IoT data startup Microshare.

Tim is a technology executive with over twenty years of experience in enterprise software. He was most recently Chief Architect of Accenture’s global Business Process Management (BPM) practice, where he led software architecture innovation.

Tell us about the business you represent. How did the idea come about to found your enterprise?

Microshare is the market-leading wireless IoT platform built for the commercial real estate space. The idea came to us about 10 years ago when we were deploying some very cool technologies like IoT, AI and Business Process Management at very large organizations.

The projects were in the tens of millions of dollars, usually for large banks and financial services firms which could afford to spend that kind of money on cutting-edge technologies. Our idea with Microshare was to take these compelling technologies and democratize them so that much smaller enterprises could take advantage of them to drive value in their businesses. That's what Microshare’s mission has been from the start.

As CTO, what do your day-to-day responsibilities look like?

As CTO, I look after the budget, staff and roadmaps for product development, IT and cloud services at Microshare. With finite resources and many demands, both internal and external, I see my job as balancing the desire for continual product and process innovation with the need for stable and scalable services.

I am planning for the long term but also dealing with immediate needs and everyday emergencies. I am fortunate to control much of my own destiny, though “current me” often finds reason to curse “past me.”

What frameworks do you use for managing technical debt?

Technical debt is literally what keeps me up at night. We now have a mature platform that has been running for over 10 years. We're always looking to modernize, upgrade and maintain the core systems while also adding new features and new products. It's a constant juggling act.

The analogy I use is that it's like building a gazebo out of softwood. As soon as you're done building the gazebo, the softwood starts to rot, and it rots at different speeds and for different reasons. You've got weather and you've got bugs. You have to start maintaining that gazebo almost immediately if you want it to stand up – cut out the rotten wood, replace different pieces, paint it and tighten the screws. Ultimately, how quickly it degrades will depend on external forces, like the weather, and internal stresses, like how you actually use it.

Software is a lot like that. As soon as you put the software up, it begins to decay. That might be for cybersecurity reasons, because the libraries we use or the ways we build things turn out to have vulnerabilities, or because hardware evolves and you need to make sure the software stays compatible with the hardware it runs on. All of these factors conspire to increase the need for maintenance. Use-case drift also happens, because the business will use the software in ways that differ from the original design.

Now, some tech debt is consciously introduced. When people write software, they don’t always do the most robust design, the heaviest testing or the best coding because of time or cost constraints. There's always a list of to-dos when we execute any software project. It's a combination of the things we wish we had had time to do during the initial build and that slow rotting over time that you have to address. You've got to service the debt routinely to make sure it doesn't become unserviceable.

The cost of that liability is the balancing act we go through on our roadmap. What we do from a framework perspective is fairly informal, since we have a relatively small and expert development team. We keep track of tech debt in our code base using to-dos in our comments. Whenever we are writing software and know there's a place where we would like to spend more time, or think there may be a weakness that could be exploited, we'll put these to-dos in there.

We then use our software tools to periodically track and review the to-dos, and over time we chip away at them. We also track them through our product roadmap and task tracking system. My roadmap has a mix of maintenance items and new features and functions, and I'm constantly juggling those as we set quarterly goals for the engineering teams.
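As a rough illustration of that kind of tooling (a sketch under stated assumptions, not Microshare's actual setup), a small script can sweep a repository for to-do style comments so they surface in periodic reviews. The file extensions, tag names and "src" root here are invented for the example:

```python
# Hypothetical sketch: sweep a repository for TODO/FIXME comments so they can be
# reviewed periodically alongside the roadmap. The extensions, tag names and
# "src" root directory are illustrative assumptions.
import re
from pathlib import Path

TAG = re.compile(r"(?:#|//)\s*(TODO|FIXME)\b[:\s]*(.*)", re.IGNORECASE)
EXTENSIONS = {".py", ".js", ".ts", ".go", ".scala"}

def collect_debt_markers(root):
    """Yield (file, line number, note) for every TODO/FIXME comment under root."""
    for path in Path(root).rglob("*"):
        if path.suffix not in EXTENSIONS or not path.is_file():
            continue
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
            match = TAG.search(line)
            if match:
                yield str(path), lineno, match.group(2).strip()

if __name__ == "__main__":
    for file, lineno, note in collect_debt_markers("src"):
        print(f"{file}:{lineno}: {note}")
```

In practice the resulting list might be exported into the task tracker so that maintenance items compete with feature work for roadmap slots.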

Of course, there’s also the unexpected. As we scale to a million devices, we run into issues that often come from some level of tech debt. For example, we'll hit a scaling problem and be forced to deal with that debt right then and there. So it's a combination of those things – looking forward and reducing the friction, the support load and the maintenance load, and then looking for the upcoming bottlenecks that will need to be fixed.

How do you decide whether something is so painful that it merits a full rewrite?

That is ultimately the question, and it's something we discuss at our developer meetings almost weekly. We have particularly old or critical microservices in the platform that occasionally come up as pain points. We're constantly asking ourselves: is it worth making targeted fixes, or should we scrap the service entirely and rewrite it?

I wish I had a really clear metric for this, but what we look at is the amount of time spent doing maintenance for a routine change. We'll try to estimate upfront how long we think an improvement to a particular service should take, and then we'll look at the difference between what we thought it should take and what it actually took. When a pattern emerges of things routinely taking longer than you expect, that’s usually when we start to think a full rewrite is necessary.
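A minimal sketch of that signal, assuming per-task estimate and actual hours pulled from a task tracker, might look like the following. The field names, sample data and 1.5x threshold are invented for illustration, not a formula Tim gave:

```python
# Hypothetical sketch: flag a service for a rewrite discussion when routine
# changes consistently overrun their estimates. All numbers are illustrative.
from statistics import mean

def overruns_estimates(tasks, threshold=1.5):
    """Return True when the average actual/estimate ratio exceeds the threshold.

    Each task is a dict with "estimate_hours" and "actual_hours".
    """
    ratios = [t["actual_hours"] / t["estimate_hours"]
              for t in tasks if t["estimate_hours"] > 0]
    return bool(ratios) and mean(ratios) > threshold

# Three recent routine changes to an aging microservice (invented data)
history = [
    {"estimate_hours": 4, "actual_hours": 9},
    {"estimate_hours": 2, "actual_hours": 5},
    {"estimate_hours": 6, "actual_hours": 10},
]
print(overruns_estimates(history))  # True -> raise it at the developer meeting
```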

It's the complexity that really drives the amount of time. And the complexity grows as these things get older. Eventually, you get so far away from the original intent that you have to go back and look at the design. That's usually when we see that a rewrite is necessary.

How does your technical organization work with the other operational components of the company?

Ultimately, we create the product that the rest of the company sells, deploys and finances, but as a SaaS business we also build a lot of the fundamental components that enable our own operations. So it's half product and half IT. We recently split our engineering teams into product and IT teams, but they still work very closely, because you need a lot of IT support to operate a company that is part software, part hardware and all about data in the middle.

Our operations – from shipping and inventory to finance, customer service and sales – all that is deeply tied to our core platform because it's a SaaS system. If we were just shipping widgets, then we could have a much different separation between product and operations. We are deeply woven into all of the company’s operational components, which makes it a challenge. But it also makes it quite interesting because we're always trying to figure out how to optimize the operations, automate what we can and provide tools for people to work better.

As we scale, it's not just the platform that needs to grow. It's the business itself that needs to scale around it. If you've got a product that can scale but a company that cannot, your customers can't get access to it. If you've got a company that can scale but a product that cannot, you're going to make people really angry because the product isn't fulfilling their needs.

Are there any technology trends you keep a close eye on?

I don't spend a lot of time looking at trends. I spend more time looking at what has worked in the past. How do we extend those practices, tools and technologies into new areas? It's really not about trends to me. It's about doing good, solid engineering.

You see fads come and go. A new operating system, a new programming language, a new buzzword. We try not to go chasing these things because we know that we need to have really bulletproof technology. That's maybe the niche we're in. Growing up building software for banks, we couldn’t take massive risks by using the latest and greatest cutting-edge stuff. We had to be innovative but innovate on a base that is dependable.

That's the line we walk at Microshare. We want to do these big, cool things that have a real impact on people's businesses, but we don't want to be on the bleeding edge because we want to make sure we have the stability and scalability that comes from using tried and true techniques and technologies.

Do you or your technical teams use log analysis as part of your roles? If so, how do you find this helps day-to-day operations?

Log analysis and performance metrics are really important to what we do because we are very aggressively scaling our platform and our business, constantly looking at where the next bottleneck will be. As you go from one device to a million, there are a lot of plateaus in performance, and with a complex system doing as many things as we currently do, it can often be a surprise what the next limit to growth will be.

Being able to look at how the whole complex system is performing in a clear, consistent and consolidated place has been very important to help us get ahead of problems and make concrete engineering decisions. When we allow tech debt to sneak up on us, log analysis is really important to quickly troubleshoot and reduce it.
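As one hedged example of what "getting ahead of problems" can look like in practice (a sketch, not Microshare's actual monitoring stack), a rolling average over a latency metric parsed from logs can flag a sustained plateau before it becomes the next limit to growth. The window size, 250 ms limit and sample values below are assumptions made for the example:

```python
# Hypothetical sketch: flag a sustained latency plateau from metrics parsed out
# of logs. Window size, limit and sample values are invented for illustration.
from collections import deque

class PlateauDetector:
    """Flags when the rolling average of a latency metric stays above a limit."""

    def __init__(self, limit_ms, window=5):
        self.limit_ms = limit_ms
        self.samples = deque(maxlen=window)

    def observe(self, latency_ms):
        self.samples.append(latency_ms)
        full = len(self.samples) == self.samples.maxlen
        return full and sum(self.samples) / len(self.samples) > self.limit_ms

# Simulated per-request latencies in milliseconds, as if parsed from application logs
readings = [120, 130, 150, 180, 260, 270, 290, 310, 305, 300]
detector = PlateauDetector(limit_ms=250)
for i, latency in enumerate(readings):
    if detector.observe(latency):
        print(f"sample {i}: sustained latency above 250 ms - possible scaling bottleneck")
```

A real deployment would presumably feed the detector from a log pipeline or metrics store rather than a hard-coded list.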

By necessity, we look at it routinely and use the insights we find there to figure out how we're going to make both tactical and strategic updates, which feeds into everything we already talked about in terms of tech debt and rewrites.

It really is driven by the data that we collect. Our customers come to us because they want to be data-driven, so we have to be a data-driven company ourselves. Log analysis, performance metrics and infrastructure observability in general are critical to that.

What can we hope to see from your business in the future?

As Microshare continues to scale, we're continually adding new analytical tools on top of the data streams that we manage and create. We want to make it as easy as possible for enterprises and large property managers to adopt the solutions without having to understand and go through the entire journey of data discovery.

I think you can look forward to more pointed products but at the same time a broader range of sensors, more data collected, and larger and easier scaling – because the world needs to be measured, and the larger your patch of that world, the more difficult it is to manage the infrastructure. Ultimately, that's a key part of our mission: to keep the risks and effort low for managing this sensing network so that you can focus on improving your core assets and core business without needing to worry about the technology itself.

In terms of geographic growth, Microshare is already a global business. We had an explosion of deployments across all of the different time zones as a result of our work in contact tracing during the pandemic, and that global footprint has stayed with us as the world has emerged from the pandemic period. When there can be no downtime, it's a challenge to keep things running and growing.

We don't have these giant maintenance windows where we can be unavailable for hours. It becomes even more important to think through technical debt reduction, using tools and techniques like log analysis and metrics to make good decisions and minimize downtime. It often feels like we’re trying to maintain the aeroplane while flying it. It's always a challenge, but it does keep things interesting, for sure.

If you enjoyed this article then why not check out our previous round-up on Prometheus monitoring tools or our guide to the best Splunk alternatives?
