Painless visualization of Microsoft/ethr log with ELK
Microsoft/ethr is a simple network performance measurement tool that works in the terminal. As the name suggests, it is released by Microsoft and has been actively maintained by many contributors recently.
(Image from the official repository)
I liked it because it is very simple and easy to use (and it's always fun for me to try new tools). Just running the following commands provides the logs and a nice terminal-based UI for per-host network performance measurement.
Server: the TUI appears and ethrs.log is created in the current directory.
$ ethr -s -ui
Client: ethrc.log is created in the current directory.
$ ethr -c <server-ip>
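For a longer measurement (the kind that produces the millions of log lines discussed below), the client-side duration and number of parallel sessions can be tweaked. The flags below are from my recollection of the ethr help output at the time, so double-check them with ethr -h on your version:
# run a bandwidth test with 4 parallel sessions for one hour
# (flags from memory; verify with `ethr -h`)
$ ethr -c <server-ip> -n 4 -d 1h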
Analyzing the result
It is not realistic to analyze the raw log after a long-running performance measurement, which generates millions of log lines. Fortunately, the log is machine-friendly by default (and inevitably so): one JSON object per line, formatted like below. So I decided to use Fluentd + Elasticsearch + Kibana as the aggregator / visualizer.
Sample ethrs.log
{"Time":"2019-05-01T13:06:11Z","Type":"TestResult","RemoteAddr":"100.100.103.153","Protocol":"TCP","BitsPerSecond":"9.98M","ConnectionsPerSecond":"","PacketsPerSecond":"","AverageLatency":""}
{"Time":"2019-05-01T13:06:11Z","Type":"TestResult","RemoteAddr":"100.100.103.181","Protocol":"TCP","BitsPerSecond":"384K","ConnectionsPerSecond":"","PacketsPerSecond":"","AverageLatency":""}
{"Time":"2019-05-01T13:06:11Z","Type":"TestResult","RemoteAddr":"100.100.103.146","Protocol":"TCP","BitsPerSecond":"3.20M","ConnectionsPerSecond":"","PacketsPerSecond":"","AverageLatency":""}
{"Time":"2019-05-01T13:06:11Z","Type":"TestResult","RemoteAddr":"[SUM]","Protocol":"TCP","BitsPerSecond":"315.14M","ConnectionsPerSecond":"0","PacketsPerSecond":"0","AverageLatency":""}
In this post, I'll explain how to visualize the BitsPerSecond field and make it analyzable in Kibana.
Setting up the environment
I won't explain how to install the software, as there is well-organized documentation and there are many blog posts for each. Instead, I'll share some ethr-specific tips that I picked up during my struggles with the setup.
For simplicity, all of the software, that is, the ethr server, Fluentd (td-agent), Elasticsearch, and Kibana, is assumed to be installed on the same host.
Handling timestamp in Fluentd
At first, I set time_key in the <source> section to Time, the field that ethrs.log has in each line, but it did not work; the parsed log in /var/log/td-agent/td-agent.log showed "1970-01-01 00:00:00".
This is strange, because the time format looked like a standard one (ISO 8601 compliant), and td-agent recognized the key successfully (the original key/value is omitted from td-agent.log when time parsing succeeds).
Sample /etc/td-agent/td-agent.conf
<source>
  type tail
  tag ethrs
  format json
  path /var/log/ethrs/ethrs.log
  pos_file /var/log/ethrs/ethrs.log.pos
  time_key Time # <= THIS DIDN'T WORK :(
</source>

<match ethrs>
  @type copy
  <store>
    @type stdout
  </store>
  <store>
    @type elasticsearch
    host localhost
    port 9200
    index_name td.${tag}
    type_name td.${tag}
  </store>
</match>
*continuous logging is not considered in this configuration
Maybe digging into the Fluentd documentation would have solved the problem, but as the title of this post says, I'd like to visualize the logs as painlessly as possible, so I decided not to let td-agent parse the timestamp. I knew it could easily be handled by Kibana: just select "Time" when creating the Index Pattern in the Kibana GUI and it's done ;)
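For the record, the likely fix on the Fluentd side is to tell the JSON parser how to read the string timestamp; one guess is that, without a time_format, the parser falls back to treating the value as a UNIX epoch, which would explain the 1970 result. This is an untested sketch, not something I verified on this setup:
<source>
  type tail
  tag ethrs
  format json
  path /var/log/ethrs/ethrs.log
  pos_file /var/log/ethrs/ethrs.log.pos
  time_key Time
  # assumption: spelling out the format lets the parser read the ISO 8601 string;
  # depending on the td-agent version, time_type string may also be needed
  time_format %Y-%m-%dT%H:%M:%S%z
</source>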
Converting units
The value in "BitsPerSecond" field has human-friendly representation of bits/sec with units. The unit varies like "43.65M", "384K", and "0"(I haven't seen non-zero value without "M" or "K", but there might be). Due to this I needed to convert the unit to make it a single format.
There were two options to address this problem:
- Edit the source code and rebuild ethr. In the ethr source code, the value appears to be converted with numberToUnit(), and this behavior is currently hard-coded. Just removing the conversion from the source would not be much of a burden, but it is bad practice from a maintainability point of view. Sending a pull request that makes the units configurable with a new command-line option would be better, but much more tiresome.
- Create a Scripted Field in Kibana. A Scripted Field is a feature that adds custom fields to query results (not to the records themselves) via scripting. The supported script languages are "Painless" (the default) and "Lucene expressions".
I eventually chose the latter option, as it seemed more fun and likely to come in handy again as long as I keep using Kibana. This turned out to be the most painful part, though, even though the DSL used in Scripted Fields is called "Painless"...
Creating Scripted Field
To create a new Scripted Field, go to Management > Index Patterns > <index name> > Scripted fields. The new field I created is bps, so its "Type" and "Format" should be "Number".
Painless is written in a Java-like syntax. What the script below does is:
- read the "BitsPerSecond" field into raw (returning 0 if the field is missing or empty)
- strip the unit suffix ("K", "M", "G") from raw and scale the number accordingly
- return the result as a float value
// return 0 when the field is missing in the document
if (doc['BitsPerSecond.keyword'].size() == 0) {
  return 0;
}

String raw = doc['BitsPerSecond.keyword'].value;
// guard against empty values, which Float.parseFloat() would reject
if (raw == null || raw.isEmpty()) {
  return 0;
}

// strip the unit suffix ("K", "M", "G") and scale the number accordingly
if (raw.contains('K')) {
  String num = raw.substring(0, raw.indexOf('K'));
  return Float.parseFloat(num) * 1024;
} else if (raw.contains('M')) {
  String num = raw.substring(0, raw.indexOf('M'));
  return Float.parseFloat(num) * 1024 * 1024;
} else if (raw.contains('G')) {
  String num = raw.substring(0, raw.indexOf('G'));
  return Float.parseFloat(num) * 1024 * 1024 * 1024;
} else {
  return Float.parseFloat(raw);
}
I'm not so familiar with the built-in Java methods, so the script above may be somewhat inefficient. Please let me know if there's a better way.
Visualizing
I did nothing special to visualize the bps value in Kibana. Here are some small tips for analyzing ethrs.log.
- Exclude the lines where "RemoteAddr" is "[SUM]" for per-host analysis.
There are "[SUM]" lines in ethrs.log that represent the aggregated measurement over all clients, but they are annoying to see in the graph when analyzing on a per-host basis. Just putting in the following Lucene query will work.
NOT (RemoteAddr:"[SUM]")
- Use the Average aggregation for the Y-Axis.
Plotting every second (= every line) is a kind of disaster for the machine and your browser. Selecting Average for the Y-Axis and setting the interval to auto, or to some custom value that is not too small (depending on the time range to analyze), lets you work on the analysis comfortably.
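For completeness, roughly the same chart data can be pulled straight out of Elasticsearch with a date_histogram plus an average aggregation. The sketch below is untested and makes a few assumptions: the index is named td.ethrs (what the td-agent config above would produce), the Time field was dynamically mapped as a date, and since the bps scripted field exists only inside Kibana, the unit conversion is inlined as an aggregation script. The histogram interval syntax also varies across Elasticsearch versions.
GET td.ethrs/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must_not": { "term": { "RemoteAddr.keyword": "[SUM]" } }
    }
  },
  "aggs": {
    "bps_over_time": {
      "date_histogram": { "field": "Time", "interval": "1m" },
      "aggs": {
        "avg_bps": {
          "avg": {
            "script": {
              "lang": "painless",
              "source": "if (doc['BitsPerSecond.keyword'].size() == 0) { return 0; } String raw = doc['BitsPerSecond.keyword'].value; if (raw.contains('K')) { return Float.parseFloat(raw.substring(0, raw.indexOf('K'))) * 1024; } if (raw.contains('M')) { return Float.parseFloat(raw.substring(0, raw.indexOf('M'))) * 1024 * 1024; } if (raw.contains('G')) { return Float.parseFloat(raw.substring(0, raw.indexOf('G'))) * 1024 * 1024 * 1024; } return Float.parseFloat(raw);"
            }
          }
        }
      }
    }
  }
}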
Summing up
- Microsoft/ethr is very easy to use. It's worth considering if you don't need super-precise measurements.
- It's still at such an early stage of development that some features, such as an option to choose the logging format, are missing.
- Elasticsearch & Kibana, and especially the Painless scripting language, help customize the logs (though with quite some pain for beginners).
Hope this helps anyone interested in ethr. Happy performance measuring!