
Data Engineering interview questions you can expect for a realtime / streaming position

When interviewing for a (senior) data engineering position, you may well see questions like these. Even if you are not asked them, they are good to know.

Of course, if you don’t know an answer in an interview, that is usually not a problem.
Big Data is a broad field, so just answer as best you can and explain your areas of expertise in as much detail as you can.


JVM:

  1. Explain the difference between heap memory and stack memory.
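
One way to make the heap/stack question concrete is a small sketch like the one below. It assumes the usual simplified model: objects live on the heap, while local primitives and references live in a method's stack frame (the JIT may optimize this in practice). The names `HeapVsStack`, `Counter`, `bumpShared` and `bumpLocal` are made up for illustration.

```scala
object HeapVsStack {
  // Instances of this class are allocated on the heap.
  final class Counter(var n: Int)

  // `c` is a reference copied into this call's stack frame. It points at the
  // caller's heap object, so the mutation is still visible after the call returns.
  def bumpShared(c: Counter): Unit = c.n += 1

  // `x` is a primitive Int copied into this call's stack frame.
  // The caller's own value is left untouched.
  def bumpLocal(x: Int): Int = x + 1

  def main(args: Array[String]): Unit = {
    val counter = new Counter(0) // reference on the stack, object on the heap
    val local = 0                // primitive value in the stack frame
    bumpShared(counter)
    val bumped = bumpLocal(local)
    println(s"counter.n = ${counter.n}, local = $local, bumped = $bumped")
  }
}
```

A good answer would probably also mention that each thread has its own stack, that the heap is shared and garbage collected, and that running out of either gives a different error (`StackOverflowError` versus `OutOfMemoryError`).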

Scala / Spark:

  1. Why use a case class as a construct when using Spark?
    Benefits of this over a regular class?

  2. How to minimize shuffling in Spark between executors?

  3. Ways to make Spark joins more performant? (keys, broadcast, etc.)
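
For the case class question, a rough sketch in plain Scala (no Spark dependency) can show what a case class gives you over a regular class: structural equality, `copy`, pattern matching and the `Product` interface. Spark's built-in encoders for typed Datasets rely on that last one to derive a schema from the fields. `Event`, `PlainEvent` and `describe` are hypothetical names used only for this example.

```scala
object CaseClassDemo {
  // Hypothetical record types, for illustration only.
  case class Event(userId: Long, action: String)
  class PlainEvent(val userId: Long, val action: String)

  // Case classes get an extractor for free, so they work in pattern matches.
  def describe(e: Event): String = e match {
    case Event(id, "click") => s"user $id clicked"
    case Event(id, other)   => s"user $id did $other"
  }

  def main(args: Array[String]): Unit = {
    val a = Event(1L, "click")
    val b = Event(1L, "click")

    // Case classes compare by field values.
    println(a == b)
    // A regular class falls back to reference equality unless equals is overridden.
    println(new PlainEvent(1L, "click") == new PlainEvent(1L, "click"))

    // Immutable update: a new instance with one field changed.
    println(a.copy(action = "view"))

    // Case classes implement Product, which exposes the fields generically.
    println(a.productArity)
    println(describe(a))
  }
}
```

In an interview you could tie this back to Spark: with a case class, `spark.implicits._` can supply an encoder for a `Dataset[Event]`, whereas an ordinary class needs a hand-picked encoder (for example a Kryo or Java serialization based one) and loses the column-level schema.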

```