Apache Flink on YARN with Kerberos Authentication (Sep 17, 2019): Setting up Flink on YARN is described fairly straightforwardly in the documentation. But what if what you need to do is much more…
Apache Spark Custom Logging (Jun 11, 2019): The example below applies only if your Spark job runs on YARN in client deploy mode.
Scala: List and Iterator Simulation (May 27, 2019): A simple simulation of the difference between List and Iterator. We all know that in the Scala collections hierarchy, Iterable is very much the…
AVRO vs Parquet — what to use? (Jan 16, 2019): I won't say one is better than the other, as it depends entirely on where they are going to be used.
Who’s this Spark Listener? (Jun 13, 2018): Jacek Laskowski wrote good documentation on Spark listeners. I made this page because we keep encountering this ERROR:
Relocating Classes using Apache Maven Shade Plugin (Oct 24, 2017): I just ran into something odd while running a Spark job.
Import MySQL data to HDFS using Sqoop (Oct 4, 2017): The trending topic in big data right now is AI. It may seem odd that I am posting something that is not new in the big data world. This…
Streaming Kafka Receiver-Less Approach (Jul 5, 2017): See https://spark.apache.org/docs/latest/streaming-kafka-0-8-integration.html for more information about the receiver-less approach.