Automated build using Jenkins Pipeline

In my earlier post on using Spark for financial analysis, we were either running the code within IntelliJ or doing a manual Gradle build. In this post, let's take a look at how we can leverage a Jenkins Pipeline for automated builds. The code discussed in this post is available on GitHub. Jenkins is a web …
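
As a rough sketch of the idea, a declarative Jenkinsfile for a Gradle project could look like the following; the stage names, the use of the Gradle wrapper, and the artifact path are illustrative assumptions rather than the post's actual pipeline.

```groovy
// Minimal declarative Jenkins Pipeline sketch for a Gradle project
// (stage names, wrapper usage and artifact path are assumptions)
pipeline {
    agent any
    stages {
        stage('Checkout') {
            steps {
                // Pull the source from the repository configured for this job
                checkout scm
            }
        }
        stage('Build') {
            steps {
                // Use the Gradle wrapper checked into the repository
                sh './gradlew clean build'
            }
        }
        stage('Archive') {
            steps {
                // Keep the built JAR(s) with the Jenkins build record
                archiveArtifacts artifacts: 'build/libs/*.jar', fingerprint: true
            }
        }
    }
}
```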

Running Java application in Docker Container

In one of my earlier posts, we looked at how to create a Spark application to read data from a CSV file. In this post, we'll take a look at how a Docker image can be created for the Spark Financial Analysis application so that it can be easily run inside a container. The code …
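
For a flavour of what such an image definition might look like, here is a minimal Dockerfile sketch; the base image, JAR name and paths are hypothetical placeholders rather than the post's actual setup.

```dockerfile
# Sketch of a Dockerfile packaging a Gradle-built fat JAR
# (base image, JAR name and main entry point are assumptions)
FROM openjdk:8-jre-alpine

WORKDIR /opt/app

# Copy the fat JAR produced by the Gradle build into the image
COPY build/libs/spark-financial-analysis-all.jar app.jar

# Run the application when the container starts
ENTRYPOINT ["java", "-jar", "app.jar"]
```

An image built from this could then be produced and started with `docker build -t spark-financial-analysis .` followed by `docker run spark-financial-analysis`.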

Financial Data Analysis using Kafka, Storm and MariaDB

In my previous posts, we looked at how to integrate Kafka and Storm for streaming loan data and cleansing it before ingesting it into the processing pipeline for aggregation. We also looked at how to leverage Liquibase for managing the relational database in the form of immutable scripts that can be version controlled. This fits …
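
As an illustration of the final step of such a pipeline, the sketch below shows a terminal Storm bolt (assuming Storm 2.x) that persists aggregated figures to MariaDB over JDBC; the tuple fields, table name, and connection settings are assumptions for illustration, not the post's actual code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

/**
 * Sketch of a terminal Storm bolt that writes aggregated loan figures to MariaDB.
 * Table, columns, tuple fields and JDBC settings are assumptions for illustration.
 */
public class LoanAggregateJdbcBolt extends BaseBasicBolt {

    // Not serializable, so created on the worker in prepare(), not on submission
    private transient Connection connection;

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context) {
        try {
            // Hypothetical connection settings; in practice these come from configuration
            connection = DriverManager.getConnection(
                    "jdbc:mariadb://localhost:3306/loans", "loan_user", "loan_password");
        } catch (Exception e) {
            throw new RuntimeException("Unable to connect to MariaDB", e);
        }
    }

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // Upstream bolts are assumed to emit ("state", "totalLoanAmount") aggregates
        String state = tuple.getStringByField("state");
        double totalAmount = tuple.getDoubleByField("totalLoanAmount");
        try (PreparedStatement stmt = connection.prepareStatement(
                "INSERT INTO loan_aggregates (state, total_amount) VALUES (?, ?)")) {
            stmt.setString(1, state);
            stmt.setDouble(2, totalAmount);
            stmt.executeUpdate();
        } catch (Exception e) {
            throw new RuntimeException("Failed to write aggregate for state " + state, e);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: nothing is emitted downstream
    }
}
```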

Microservices – Database management using Liquibase

Over the last few years, the proliferation of microservices and cloud-native architecture patterns has surfaced new challenges, which in turn has led enterprises to adopt new tools and techniques. These tools and techniques allow for a smoother transition in their cloud-native journey and help them become more agile and nimble. In this post we'll look at …
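
For a sense of what version-controlled database scripts look like in practice, a minimal Liquibase XML changelog might resemble the sketch below; the table and column names are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal Liquibase changelog sketch; table and column names are illustrative assumptions -->
<databaseChangeLog
        xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
                            http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.6.xsd">

    <changeSet id="1" author="loan-service">
        <createTable tableName="loan_aggregates">
            <column name="id" type="BIGINT" autoIncrement="true">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="state" type="VARCHAR(2)"/>
            <column name="total_amount" type="DECIMAL(15,2)"/>
        </createTable>
    </changeSet>
</databaseChangeLog>
```

Running `liquibase update` (or the corresponding Maven/Gradle task) applies each changeSet exactly once and records it in the DATABASECHANGELOG tracking table.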

Financial Data Analysis – Kafka, Storm and Spark Streaming

In my earlier posts, we looked at how Spark Streaming can be used to process the streaming loan data and compute aggregations using Spark SQL. We also looked at how the data can be stored in the file system for future batch analysis. We discussed how Spark can be integrated with Kafka to ingest the …
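
As a rough, batch-style sketch of the aggregation step, the example below groups loan records by state with the Spark SQL DSL and writes the result to the file system; the input path, column names, and output location are assumptions for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

/**
 * Sketch of a Spark SQL aggregation over loan records; the input path,
 * column names and output location are assumptions for illustration.
 */
public class LoanAggregation {

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("loan-aggregation")
                .master("local[*]")      // local mode for a quick experiment
                .getOrCreate();

        // Read cleansed loan records from a CSV file with a header row
        Dataset<Row> loans = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("data/loans.csv");

        // Total loan amount per state, computed with the Spark SQL DSL
        Dataset<Row> aggregates = loans
                .groupBy(col("state"))
                .agg(sum(col("loan_amount")).alias("total_amount"));

        // Persist the result to the file system for later batch analysis
        aggregates.write().mode("overwrite").parquet("output/loan_aggregates");

        spark.stop();
    }
}
```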

Stream Processing using Storm and Kafka

In my earlier post, we looked at how Kafka can be integrated with Spark Streaming for processing the loan data. In the Spark Streaming process, we cleanse the data to remove invalid records before we aggregate it. We could potentially cleanse the data in the pipeline prior to streaming the loan records in …
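
To illustrate the cleansing step, the sketch below shows a Storm bolt that drops malformed loan records before they reach the aggregation stage; the "value" field name (the storm-kafka-client spout's default), the column count, and the validation rules are assumptions.

```java
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

/**
 * Sketch of a Storm bolt that cleanses raw loan records before aggregation.
 * The field name, column count and validation rules are assumptions.
 */
public class LoanRecordCleansingBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // The Kafka spout is assumed to emit the raw CSV line in a field named "value"
        String line = tuple.getStringByField("value");
        if (line == null || line.isEmpty()) {
            return;                          // drop empty records
        }
        String[] fields = line.split(",");
        // Keep only records with the assumed five columns and a parsable loan amount
        if (fields.length == 5 && isNumeric(fields[2])) {
            collector.emit(new Values(line));
        }
        // Invalid records are simply dropped (not emitted downstream)
    }

    private boolean isNumeric(String value) {
        try {
            Double.parseDouble(value.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("cleansedRecord"));
    }
}
```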

Financial Data Analysis using Kafka and Spark Streaming

In my earlier posts on Apache Spark Streaming, we looked at how data can be processed using Spark to compute aggregations and how it can be stored in a compressed format like Parquet for future analysis. We also looked at how data can be published and consumed using Apache Kafka, which is a distributed message …
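
As a sketch of the Kafka integration, the example below creates a direct stream from a Kafka topic using the spark-streaming-kafka-0-10 connector; the topic name, broker address, group id and batch interval are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

/**
 * Sketch of consuming loan records from Kafka with Spark Streaming
 * (spark-streaming-kafka-0-10). Topic, broker and group id are assumptions.
 */
public class LoanRecordStreamingApp {

    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf()
                .setAppName("loan-record-streaming")
                .setMaster("local[*]");          // local mode for a quick experiment
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "loan-analysis");
        kafkaParams.put("auto.offset.reset", "earliest");

        Collection<String> topics = Arrays.asList("loan-records");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

        // For the sketch, just print each raw record value; the real pipeline would
        // parse, cleanse and aggregate these records before persisting them.
        stream.map(ConsumerRecord::value).print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```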