A month ago (March 2020) I completed the Data Streaming Nanodegree by Udacity. The course was very comprehensive; the only thing I really missed was a capstone project at the end. For that reason I decided to build something using the technologies I learnt. In this project I will build a data pipeline to monitor production in a manufacturing system. To do this I will integrate the following components:
- Data Source: OPC-UA server running unreliable machine simulations. ✅
- Buffer: Kafka ✅
- Kafka producer: OPC-UA client subscribed to machine events.
- Stream Processing: KSQL
- KPIs: Availability, Performance, Quality, OEE
- Cycle Time
- Data Store: PostgreSQL
- Visualisations: Cube.js
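
The KPIs in the list are related by the usual OEE definition: OEE = Availability × Performance × Quality. The pipeline plans to use KSQL for stream processing, so the Python below is only a minimal sketch of the arithmetic, not the project's implementation. The `PeriodTotals` class and all of its field names are hypothetical; they stand in for whatever aggregates the stream processing step ends up producing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodTotals:
    """Hypothetical aggregates for one machine over one reporting period."""
    planned_time_s: float      # planned production time in seconds
    run_time_s: float          # time the machine was actually running
    ideal_cycle_time_s: float  # fastest possible time to produce one part
    total_count: int           # all parts produced, good or bad
    good_count: int            # parts that passed the quality check


def compute_oee(totals: PeriodTotals) -> dict:
    """Return availability, performance, quality and OEE as fractions."""
    if totals.planned_time_s <= 0 or totals.run_time_s <= 0 or totals.total_count <= 0:
        raise ValueError("planned time, run time and total count must be positive")

    # Availability: share of the planned time the machine was running.
    availability = totals.run_time_s / totals.planned_time_s
    # Performance: actual output compared with the ideal output for the run time.
    performance = (totals.ideal_cycle_time_s * totals.total_count) / totals.run_time_s
    # Quality: share of produced parts that were good.
    quality = totals.good_count / totals.total_count

    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


# Example values are made up for illustration only.
example = PeriodTotals(
    planned_time_s=25_000,
    run_time_s=20_000,
    ideal_cycle_time_s=10,
    total_count=1_800,
    good_count=1_710,
)
kpis = compute_oee(example)
print({name: round(value, 3) for name, value in kpis.items()})
```

With these made-up numbers the machine ran 80% of the planned time, at 90% of the ideal rate, with 95% good parts, which gives an OEE of roughly 0.684.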
So far I have developed the OPC-UA server and client. Each of these applications has its own Docker container. To run the server and the client, execute the following commands:
- create a network in Docker
$ docker network create OPC-UA
- run the kafka compose file
$ docker-compose -f docker-compose.kafka.yml up
- run the main compose file
$ docker-compose up
- check the content of the production.cycles topic
$ docker-compose -f docker-compose.kafka.yml exec broker kafka-console-consumer --bootstrap-server localhost:9092 --topic production.cycles --from-beginning
At this point you should see the events on the console.
- start the ksqldb-cli and check the production.cycles topic
$ docker exec -it ksqldb-cli ksql http://ksqldb-server:8088
ksql> SET 'auto.offset.reset' = 'earliest';
ksql> print 'production.cycles';
