Perspective <> Clickhouse Integration
This guide demonstrates how to integrate Perspective with Clickhouse to visualize fast-moving data streams. By following this example, you will learn how to:
- Set up a connection between Perspective and Clickhouse.
- Embed an interactive <perspective-viewer> in your web apps, enabling real-time data visualization and analysis.
Overview
Clickhouse is an open-source columnar database management system (DBMS) designed for online analytical processing (OLAP). It is known for its high performance, scalability, and efficiency in handling large volumes of data.
Main Technical Advantages:
- Columnar Storage: Optimized for reading and writing large datasets, making it ideal for analytical queries.
- Compression: Efficient data compression techniques reduce storage costs and improve query performance.
- Distributed Processing: Supports distributed query execution across multiple nodes, enhancing scalability and fault tolerance.
- Real-time Data Ingestion: Capable of ingesting millions of rows per second, making it suitable for real-time analytics.
Perspective is an open-source data visualization library designed for real-time, fast-moving, and large data volumes. It provides a highly efficient and flexible way to visualize and analyze data streams in web applications.
Technical Advantages:
- Real-time Visualization: Optimized for handling and rendering large datasets with minimal latency, making it ideal for dynamic and interactive data visualizations.
- WebAssembly and Arrow: Uses WebAssembly and Apache Arrow for fast data processing and rendering.
- Multi-language Support: Offers support for multiple backends, including Python, Node.js, and Rust, allowing seamless integration into various development environments.
Primary Use-cases:
Together, Clickhouse and Perspective are widely used in industries such as finance, telecommunications, and e-commerce for applications that require real-time analytics and reporting. The combination excels in scenarios involving fast-moving or time-series data, such as:
- Monitoring and Observability: Real-time monitoring of system metrics and logs.
- Financial Analytics: High-frequency trading data analysis and risk management.
- User Behavior Analytics: Tracking and analyzing user interactions on websites and applications.
- Real-time Ad and Impression Analytics: Analyzing ad performance and user impressions in real-time to optimize marketing strategies.
Demo Architecture & Components
This demo includes the following components:
- docker.sh: Starts a Clickhouse Docker container.
- producer.py: Generates a random stream of data and inserts it into Clickhouse every 250ms.
- perspective_server.py: Reads the data stream from Clickhouse and sets up a Perspective Server. Multiple Perspective viewers (HTML clients) can then connect and provide interactive dashboards to users.
- prsp-viewer.html: Demonstrates how to embed an interactive <perspective-viewer> custom component in a web application.
Getting Started
Start by cloning our git repo:
git clone https://github.com/ProspectiveCo/perspective-examples.git
Then navigate to the example directory inside the cloned repo:
cd perspective-examples/examples/clickhouse
You can also follow the rest of these instructions on Github: https://github.com/ProspectiveCo/perspective-examples/tree/main/examples/clickhouse
1. Start a Clickhouse Docker Container
Start by pulling and running a Clickhouse docker container to host our demo data.
./docker.sh
This script performs the following actions:
- Pulls the latest Clickhouse Docker image and runs a new container called prsp-clickhouse.
- Sets the database user/pass to admin/admin123.
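Once the script finishes, it can help to confirm that the container is accepting connections before moving on. The sketch below uses only the Python standard library and the Clickhouse HTTP /ping endpoint. It assumes docker.sh publishes the default HTTP port 8123 on localhost, which this guide does not state, and the function name clickhouse_is_up is made up for illustration.

```python
import http.client
import urllib.error
import urllib.request


def clickhouse_is_up(url: str = "http://localhost:8123/ping", timeout: float = 2.0) -> bool:
    """Return True if the Clickhouse HTTP interface answers the ping endpoint.

    The default URL is an assumption: it presumes the container exposes
    the standard HTTP port 8123 on localhost.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # A healthy server replies with HTTP 200 and the body "Ok."
            return response.status == 200 and response.read().strip() == b"Ok."
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply all count as "not up".
        return False
```

Calling clickhouse_is_up() returns False if the container is still starting or the port is mapped differently, in which case check the port mapping in docker.sh.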
2. Set Up Python Virtual Environment
Next, create a new Python 3 virtual environment to manage the dependencies for this demo.
python3 -m venv venv
source venv/bin/activate
pip install -U pip
pip install -r requirements.txt
The requirements.txt file includes the clickhouse-connect and perspective-python packages. The perspective-python package is the Python binding for Perspective, allowing you to create and manage Perspective tables, views, and servers. It supports both real-time data visualization and real-time publishing of data to Perspective Viewer clients.
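As a quick sanity check after installation, the following sketch reports which of the two packages cannot be imported from the active virtual environment. It assumes the import names are clickhouse_connect and perspective (the usual module names for these packages); missing_modules is a hypothetical helper, not part of the demo.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for clickhouse-connect and perspective-python.
REQUIRED = ["clickhouse_connect", "perspective"]
```

Running missing_modules(REQUIRED) inside the activated venv should return an empty list if the install succeeded.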
3. Run the Producer Script
The producer.py script is responsible for creating a Clickhouse table, generating random market data, and inserting it into the Clickhouse Docker container at regular intervals.
To run the producer script, execute the following command:
python producer.py
This script performs the following actions:
- Creates a table named stock_values to store the market data.
- Generates random market data, including timestamps, ticker symbols, client names, and stock prices.
- Inserts the generated data into the Clickhouse table every 250ms.
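To make the data-generation step concrete, here is a minimal sketch of how one batch of random rows could be built. It is not the code from producer.py: the ticker and client values, the price ranges, and the helper names make_row and make_batch are all assumptions for illustration. The column names follow the table schema used later in this guide.

```python
import random
from datetime import datetime, timezone

# Hypothetical sample values; the real producer.py may use different ones.
TICKERS = ["AAPL", "MSFT", "GOOGL", "AMZN"]
CLIENTS = ["Client A", "Client B", "Client C"]
COLUMNS = ["timestamp", "ticker", "client", "open", "high", "low", "close", "volume", "date"]


def make_row(now):
    """Build one random row whose high/low bracket the open/close prices."""
    open_price = round(random.uniform(50.0, 500.0), 2)
    close_price = round(open_price * random.uniform(0.98, 1.02), 2)
    high = round(max(open_price, close_price) * random.uniform(1.0, 1.01), 2)
    low = round(min(open_price, close_price) * random.uniform(0.99, 1.0), 2)
    return {
        "timestamp": now,
        "ticker": random.choice(TICKERS),
        "client": random.choice(CLIENTS),
        "open": open_price,
        "high": high,
        "low": low,
        "close": close_price,
        "volume": float(random.randint(100, 10_000)),
        "date": now.date(),
    }


def make_batch(size, now=None):
    """Build `size` rows that share one timestamp (defaults to the current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return [make_row(now) for _ in range(size)]
```

A producer loop would then call make_batch every 250ms and hand the rows to the Clickhouse client for insertion.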
4. Run the Perspective Server
The perspective_server.py script is responsible for consuming data from Clickhouse, standing up a Perspective server, and refreshing data every 250ms. It publishes changes to Perspective clients via a Tornado WebSocket.
Open a new terminal and activate your virtualenv. To run the Perspective server, execute the following command:
python perspective_server.py
This script performs the following actions:
- Connects to the Clickhouse database and reads the latest data from the stock_values table.
- Sets up a Perspective server that creates and manages a Perspective table with the data schema.
- Refreshes the data and updates the Perspective table.
- Publishes the changes to connected Perspective clients via a Tornado WebSocket, enabling real-time data visualization and interaction.
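This guide does not show how the script decides which rows count as "the latest" on each refresh. One common approach, sketched here as an assumption rather than as the demo's actual logic, is to keep a high-water mark of the newest timestamp already forwarded and pass along only rows newer than it. The helper name select_new_rows is hypothetical.

```python
from datetime import datetime
from typing import Optional


def select_new_rows(rows, last_seen: Optional[datetime]):
    """Return (new_rows, new_last_seen), keeping only rows newer than last_seen.

    `rows` is a list of dicts with a "timestamp" key. On the first call,
    pass last_seen=None to accept every row.
    """
    if last_seen is None:
        new_rows = list(rows)
    else:
        new_rows = [row for row in rows if row["timestamp"] > last_seen]
    if new_rows:
        # Advance the mark to the newest timestamp that was just accepted.
        last_seen = max(row["timestamp"] for row in new_rows)
    return new_rows, last_seen
```

Filtering like this avoids appending the same rows to the Perspective table on every refresh. In practice the same condition can usually be pushed into the SQL query instead.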
5. Open the Perspective Viewer
The final step is to open the prsp-viewer.html file in your web browser (Chrome recommended). This file contains an embedded <perspective-viewer> component that provides interactive data visualization and renders new values in real time as they arrive from the server.
Code Explained
Perspective Server
The perspective_server.py script is the backbone of the integration, responsible for setting up the Perspective server and managing real-time data updates. Here are the key components:
1. Setting Up the Perspective Server
The Perspective server is created using the perspective.Server() class. This server will manage the Perspective tables and handle client connections.
perspective_server = perspective.Server()
2. Creating a Perspective Table
A new Perspective table is created with a defined schema. This table will store the data read from Clickhouse and will be updated periodically.
schema = {
    "timestamp": "datetime",
    "ticker": "string",
    "client": "string",
    "open": "float",
    "high": "float",
    "low": "float",
    "close": "float",
    "volume": "float",
    "date": "date",
}
table = perspective_server.new_local_client().table(schema, limit=1000, name=TABLE_NAME)
TABLE_NAME is the name under which the table is hosted on the server. The Perspective Viewer client later uses this name to open the table over the WebSocket connection.
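The limit=1000 argument caps the table at 1000 rows: once the cap is reached, new rows replace the oldest ones. The standard-library sketch below is only an analogy for that rolling-window behavior, not Perspective code, and the helper name rolling_update is made up for illustration.

```python
from collections import deque

LIMIT = 1000  # same value as the limit= argument above


def rolling_update(buffer, rows):
    """Append rows to a bounded buffer; the oldest entries fall off once it is full."""
    buffer.extend(rows)
    return buffer


# A deque with maxlen behaves like a fixed-size rolling window.
window = deque(maxlen=LIMIT)
```

With a small cap the effect is easy to see: feeding five values into a deque(maxlen=3) leaves only the last three.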
3. Updating the Perspective Table
The updater function reads data from Clickhouse and updates the Perspective table. This function is called periodically to ensure the table contains the latest data.
def updater():
    data = read_data_from_clickhouse(clickhouse_client)
    table.update(data)
    logger.debug(f"Updated Perspective table with {len(data)} rows")
tornado.ioloop.PeriodicCallback() calls the updater function at a fixed interval so that the Perspective table is refreshed with the latest data from Clickhouse. Its callback_time argument is given in milliseconds, which is why INTERVAL is multiplied by 1000.
# start the periodic callback to update the table data
callback = tornado.ioloop.PeriodicCallback(callback=updater, callback_time=(INTERVAL * 1000))
callback.start()
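For readers unfamiliar with Tornado, the scheduling pattern can be approximated with the standard library. The sketch below is an asyncio analogy, not the demo's code: it assumes INTERVAL is 0.25 seconds (matching the 250ms refresh described earlier), and run_periodically is a hypothetical helper that stops after a fixed number of iterations so it can be run as a quick experiment.

```python
import asyncio

INTERVAL = 0.25  # seconds; assumed to match the 250ms refresh described earlier
CALLBACK_TIME_MS = INTERVAL * 1000  # PeriodicCallback expects milliseconds


async def run_periodically(callback, interval_seconds, iterations):
    """Wait `interval_seconds`, call `callback`, and repeat `iterations` times."""
    for _ in range(iterations):
        await asyncio.sleep(interval_seconds)
        callback()
```

For example, asyncio.run(run_periodically(updater, INTERVAL, 4)) would invoke an updater function four times over roughly one second. Unlike this sketch, PeriodicCallback keeps running until it is stopped.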
4. Setting Up the Tornado WebSocket
The Tornado WebSocket is set up to serve the Perspective table to connected clients. This allows real-time data updates to be pushed to the clients.
def make_app(perspective_server):
    return tornado.web.Application([
        (r"/websocket", perspective.handlers.tornado.PerspectiveTornadoHandler, {"perspective_server": perspective_server}),
    ])
5. Running the Server
Finally, the Tornado IOLoop is started to run the server and handle incoming connections.
app = make_app(perspective_server)
app.listen(8080)
loop = tornado.ioloop.IOLoop.current()
loop.call_later(0, perspective_thread, perspective_server, clickhouse_client)
loop.start()
These components work together to create a robust system for real-time data visualization using Perspective and Clickhouse.
Perspective Viewer (client)
The prsp-viewer.html file is responsible for embedding the Perspective Viewer component in a web application and connecting it to the Perspective server to visualize real-time data. Here are the key components:
1. HTML Structure
The HTML structure includes a <perspective-viewer> element wrapped inside a container. This element will render the interactive data visualizations.
<div id="viewer-container">
    <perspective-viewer id="prsp-viewer" theme="Pro Dark"></perspective-viewer>
</div>
2. Importing Perspective Modules
The necessary Perspective modules are imported to enable the functionality of the Perspective Viewer.
<script type="module" src="https://cdn.jsdelivr.net/npm/@finos/perspective/dist/cdn/perspective.js"></script>
<script type="module" src="https://cdn.jsdelivr.net/npm/@finos/perspective-viewer/dist/cdn/perspective-viewer.js"></script>
<script type="module" src="https://cdn.jsdelivr.net/npm/@finos/perspective-viewer-datagrid/dist/cdn/perspective-viewer-datagrid.js"></script>
<script type="module" src="https://cdn.jsdelivr.net/npm/@finos/perspective-viewer-d3fc/dist/cdn/perspective-viewer-d3fc.js"></script>
3. Connecting to the Perspective Server
A script is included to connect the Perspective Viewer to the Perspective server via WebSocket. It loads the data from the server into the viewer.
<script type="module">
    import perspective from "https://cdn.jsdelivr.net/npm/@finos/perspective@3.1.3/dist/cdn/perspective.js";
    document.addEventListener("DOMContentLoaded", function() {
        async function load_viewer() {
            const table_name = "stock_values";
            const viewer = document.getElementById("prsp-viewer");
            const websocket = await perspective.websocket("ws://localhost:8080/websocket");
            const server_table = await websocket.open_table(table_name);
            await viewer.load(server_table);
        }
        load_viewer();
    });
</script>
These components work together to create an interactive and real-time data visualization experience using Perspective and Clickhouse.
Prospective Commercial Product
Prospective.co, the commercial version of Perspective, offers additional functionality and capabilities, including a built-in Perspective Server data connector. This allows users to build, customize, and share dashboards based on Clickhouse data seamlessly.
To get a trial license of Prospective, send an email to hello@prospective.co with the subject: "Clickhouse trial".
Conclusion
For any additional questions or further assistance, please feel free to email me at parham@prospective.co.
Thank you for following this guide!