Yesterday’s VMware Communities Roundtable podcast centered on VMware’s acquisition of Wavefront. Bill Roth (@BillRothVMware), Director, Product Marketing at VMware’s Cloud Management Business Unit, was in studio to talk about what Wavefront’s product is and the thinking behind the acquisition. We (maybe just me) went down an interesting speculative path of what might be done with the product. Eric was especially interested in the Software as a Service (SaaS) nature of the product delivery.
Here’s the podcast: [flash player] [mp3 direct download]
Cross-Cloud Services
As Bill pointed out, VMware talked about a suite of Cross-Cloud Services at VMworld 2016, but everything was in Technical Preview. Chris Wolf, VMware’s Chief Technology Officer, talked about those services in the following terms:
- An SLA/Availability dashboard
- Policy-based placement and optimization
- UI and API-driven cloud service broker
- Automated discovery
- Centralized multi-cloud cost accounting
- Workload migration
[…] We will deliver those capabilities as SaaS services in the future. And in terms of management, we will continue expanding our capabilities […]
Wavefront doesn’t fit any of those product descriptions, but as a SaaS-delivered management solution it could easily be considered an expansion of the planned capabilities. Interestingly, it’s an acquisition. If it officially joins the “Cross-Cloud Services” family of offerings, it would be the first member of that family that customers can actually purchase.
What Does Wavefront Do?
What does Wavefront actually do? You can see it as a cousin to management solutions such as vRealize Operations or especially vRealize Log Insight. Bill pointed out that vRealize Operations collects information roughly every 5 minutes. vRealize Log Insight collects logs from the logging sources as they’re generated, along with attached metadata (RFC 5424 syslog, for example).
Wavefront is focused on enterprise metrics rather than logging. That’s an interesting distinction that Bill mentioned. There’s a pretty good eBook available at the Wavefront site entitled A Practical Guide for Metrics Monitoring and Approaches (registration wall) which helps explain the difference (page 9).
A log entry should give you detailed information about a specific event, whereas a metric should just let you know that an event occurred. The specificity that a log message allows is what makes it valuable, but you only need that specificity when a problem arises. In normal operation, log files contain anomalies, not stacks of information telling you everything is ok (ideally, separate the information types).
Metrics can be far smaller than log messages because they convey considerably less information. They’re also much easier to evaluate. This has a real impact on how metrics can be stored, processed and retained. Metrics are better suited to giving you a good idea of how the system is performing, i.e. for overall monitoring of the system (uptime, KPIs, etc.).
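To make the size and evaluation difference concrete, here is a minimal Python sketch. Both sample lines are made up for illustration: the log entry follows the general RFC 5424 layout, and the metric line uses a hypothetical "name value timestamp source=..." layout that I am assuming for the example, not quoting from any product documentation. The helper names `size_bytes` and `parse_metric` are mine as well.

```python
# Hypothetical RFC 5424-style log entry: rich detail about one specific event.
log_entry = (
    "<165>1 2017-04-13T22:14:15.003Z app01.example.com orders 4711 ID47 "
    '[req@32473 user="alice" path="/checkout" status="500"] '
    "Checkout failed: payment gateway timed out after 30000 ms"
)

# Hypothetical metric point: it only says that something was counted.
# The "name value timestamp source=..." layout is an assumption for this sketch.
metric_point = "orders.checkout.errors 1 1492121855 source=app01"


def size_bytes(text):
    """Size of a line once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def parse_metric(line):
    """Split a metric line into (name, value, timestamp, tags)."""
    name, value, timestamp, *tags = line.split()
    return name, float(value), int(timestamp), dict(t.split("=", 1) for t in tags)


name, value, timestamp, tags = parse_metric(metric_point)

print(f"log entry:    {size_bytes(log_entry)} bytes")
print(f"metric point: {size_bytes(metric_point)} bytes")
# Evaluating the metric is a single numeric comparison; no text mining needed.
print(f"{name} on {tags['source']} is non-zero: {value > 0}")
```

The log line tells you who, what and why; the metric line only tells you that an error was counted, which is why it is so much cheaper to store and compare.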
Demos
So how is that useful? Well, there are some pretty interesting Wavefront product demos at the Wavefront site you might want to check out. I don’t want to annotate all of them, but here they are with some highlights that I saw.
- Find Signals in Noise (2:26)
Wavefront charts the CPU load across a group of 40 servers, too much information to see and understand clearly. We see a real-time average (!) chart of the group. We’re examining a thesis that the CPU behavior is correlated to a code deployment event. We see a side-by-side comparison of the CPU chart line to the last hour to see if it’s an hourly event and 24 hours earlier to see if it’s daily. Finally, we see the real-time creation of a ratio of one line to the other to see that before the deploy, the ratio was about 1, but after the deploy, there was approximately a 35% increase. That calculation was real-time. Very cool.
- Data Exploration
- Prevent False Alarms
- Search by Behavior (1:48)
Rather than showing metrics by the identity of the source, Wavefront can show them by current behavior. It can also chain conditions (for example, sources currently above a certain threshold that were below it an hour ago). The example is request latency exceeding 150ms. Instead of looking at the history of 30+ app servers, Wavefront quickly zeroed in on only the ones behaving badly (10), and then on the ones behaving badly now which were behaving well an hour before (5).
- Full Resolution Data
- Correlation Function (2:46)
The customer was having regular issues during a specific hour in the day. The team happened to find a specific switch out of hundreds which had a weird metric curve during the exact time of the problems. Wavefront was able to quickly find all the switches which were experiencing similar traffic. But even better, Wavefront has a way to find those same issues even without having lucked out on finding the first problematic switch!
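Two of the ideas from the demos above, searching by behavior and comparing a series before and after a deploy, are easy to sketch in plain Python. This is not Wavefront’s query language or API, just my own illustration of the logic. The function names, the one-sample-per-minute assumption and the sample data are all hypothetical.

```python
from statistics import mean


def behaving_badly_now(latency_ms, threshold_ms=150.0):
    """Sources whose most recent sample exceeds the threshold."""
    return sorted(s for s, series in latency_ms.items() if series[-1] > threshold_ms)


def newly_bad(latency_ms, threshold_ms=150.0, lookback=60):
    """Sources above the threshold now that were at or below it `lookback` samples ago."""
    return sorted(
        s
        for s, series in latency_ms.items()
        if len(series) > lookback
        and series[-1] > threshold_ms
        and series[-1 - lookback] <= threshold_ms
    )


def before_after_ratio(series, deploy_index):
    """Mean of the samples after the deploy divided by the mean before it."""
    return mean(series[deploy_index:]) / mean(series[:deploy_index])


# Hypothetical data: 61 one-minute latency samples per app server.
latency_ms = {
    "app01": [100.0] * 61,            # healthy the whole time
    "app02": [200.0] * 61,            # bad now, and already bad an hour ago
    "app03": [100.0] * 60 + [180.0],  # fine an hour ago, bad now
}

print(behaving_badly_now(latency_ms))  # ['app02', 'app03']
print(newly_bad(latency_ms))           # ['app03']

# Hypothetical average CPU load, with a deploy at sample 10.
cpu_load = [2.0] * 10 + [2.7] * 10
print(round(before_after_ratio(cpu_load, 10), 2))  # 1.35
```

The point of the chained condition is that it shrinks the list you have to look at: three servers become two, then one, without anyone reading a single chart.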
Final Thoughts
I don’t know what the acquisition cost, but I wonder if we could realize the value just by adding optional metric monitoring to our base virtualization product and raising customer service levels with this kind of real-time analysis and metric creation. I could see us adding value to sensor data inside the data center and in use cases that have nothing to do with IT. Internet of Things is an interesting play, again, at the analytics layer. SaaS providers’ Site Reliability teams would probably love a tool like this. Heck, VMware is just getting our VMware Cloud on AWS service ready for Beta. I wonder if that team will use Wavefront to help locate trouble spots. Perhaps it will become standard practice to embed sensor output in all our software.
Is this space interesting to you? Can you see it adding value to your job at all?
image sources
- blog-anamoly-hard-feature: Wavefront.com