InfluxDB uses in-memory indexing along with time-structured merge trees. The latter combine a write-ahead log with read-only files that contain sorted, compressed time-series data. The index can also spill to disk in the event that memory is not adequate.
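The write path just described – points land in a write-ahead log and an in-memory buffer, which is periodically frozen into sorted, immutable files – can be sketched in miniature. This is purely illustrative of the general TSM idea, not InfluxDB's actual implementation; all names are ours.

```python
import bisect

class TinyTSM:
    """Toy sketch of a TSM-style write path: writes go to an append-only
    WAL and an in-memory buffer; flushing freezes the buffer into an
    immutable, sorted, read-only segment."""

    def __init__(self):
        self.wal = []        # append-only log of (timestamp, value)
        self.memtable = []   # in-memory buffer, unsorted on arrival
        self.segments = []   # read-only segments, each sorted by timestamp

    def write(self, ts, value):
        self.wal.append((ts, value))      # durability first
        self.memtable.append((ts, value))

    def flush(self):
        # Freeze the buffer into an immutable sorted segment; the WAL
        # entries it covers are then no longer needed for recovery.
        if self.memtable:
            self.segments.append(sorted(self.memtable))
            self.memtable = []
            self.wal = []

    def query(self, start, end):
        # Merge matching points from frozen segments and the live buffer.
        out = []
        for seg in self.segments:
            lo = bisect.bisect_left(seg, (start,))
            hi = bisect.bisect_right(seg, (end, float("inf")))
            out.extend(seg[lo:hi])
        out.extend(p for p in self.memtable if start <= p[0] <= end)
        return sorted(out)
```

Because frozen segments are sorted, range queries reduce to binary searches per segment plus a scan of the (small) live buffer.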
The product employs a schema-on-read approach and supports both regular and irregular time-series down to nanosecond precision. All data is stored, though we would like to see an option to refrain from storing unchanging data, or data that only changes within user-defined tolerance levels.
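The kind of option we have in mind is essentially a deadband filter. The following sketch – our own illustration, not an InfluxDB feature – drops readings that do not differ from the last stored value by more than a user-defined tolerance:

```python
def deadband(points, tolerance):
    """Keep only points that differ from the last stored value by more
    than `tolerance`; unchanging or near-unchanging readings are dropped.
    `points` is an iterable of (timestamp, value) pairs."""
    stored = []
    last = None
    for ts, value in points:
        if last is None or abs(value - last) > tolerance:
            stored.append((ts, value))
            last = value
    return stored
```

For a slowly drifting sensor this can discard the bulk of the raw points while preserving every significant change.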
The environment supports user-defined functions that can be written in a variety of languages, including Go and Python, and there are SDKs for Java, Scala and R, as well as support for Jupyter Notebooks. Historically, the company has also offered InfluxQL, which is SQL-like, and TICKscript, which is used by Kapacitor. In its latest release the company has introduced Flux, which is a superset of these two. Flux is extensible; a significant feature is that it allows queries to access external data sources to pull in contextual information about, say, devices.

From a machine learning perspective, the company focuses on providing a plug-in framework that integrates with third-party platforms such as TensorFlow, as well as with anomaly detection tools. While there are significant capabilities for storing, indexing and manipulating time-series data, as one would expect, the product is relatively weak when it comes to geo-spatial data, which will limit the company's ability to address some IoT use cases. We understand that the company is looking at how it can address this issue.
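The enrichment pattern that Flux enables – joining raw series against an external source of context – can be illustrated in plain Python. The data, field names and join key below are invented for the example:

```python
# Hypothetical raw time-series rows, keyed by device_id.
series = [
    {"time": 1, "device_id": "d1", "temp": 21.5},
    {"time": 1, "device_id": "d2", "temp": 35.2},
]

# Contextual information held in an external source, such as an asset
# database, keyed by the same device_id.
devices = {
    "d1": {"location": "lobby", "model": "TH-100"},
    "d2": {"location": "boiler room", "model": "TH-200"},
}

def enrich(rows, context):
    """Attach per-device context to each time-series row."""
    return [{**row, **context.get(row["device_id"], {})} for row in rows]
```

The enriched rows can then be filtered or grouped by the contextual fields (location, model and so on) rather than by raw device identifiers.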
Figure 2 – Query Builder
The graphical user interface supports query building and visualisation, dashboarding, and alerting. It allows you to quickly create queries (as seen in Figure 2) and use these to create real-time visualisations and dashboards. It also provides a variety of prebuilt dashboards that are ready to go out of the box. A rules engine is available, allowing you to build rules via the same interface as the query builder, before leveraging these to set up alerts or other actions based on the outcome of those rules. The latter case is particularly interesting, allowing you to, for example, automatically scale your cloud deployments based on a variety of metrics and statistics (such as, in a microservices environment, elastically scaling the number of containers based on the number of application requests – something that is not easy to do using Kubernetes alone).
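A scaling rule of the kind just described reduces to evaluating a statistic over a recent window of a metric and deriving an action from it. The sketch below is our own illustration, with invented names and thresholds:

```python
import math

def scale_rule(request_counts, per_replica_capacity, minimum=1):
    """Toy rules-engine action: given recent per-interval request counts,
    return how many container replicas the deployment should run so that
    the windowed average load fits within per-replica capacity."""
    window_avg = sum(request_counts) / len(request_counts)
    return max(minimum, math.ceil(window_avg / per_replica_capacity))
```

In practice the rule's output would drive the platform's scaling mechanism (or an alert), but the decision logic itself is no more than this.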
More generally, the platform is designed as a tool for the developer, who can either use the tools provided for that purpose or some other product such as Grafana. However, there is currently no support for ODBC or JDBC connectivity – the company is working on both – so you cannot use something like Tableau at present. On the other hand, various other connectivity options are available, such as integration with Kafka and support for Slack (widely used by developers), amongst others.