Iterate over a date range with Scala

February 14, 2020

Iterate over a date range in Scala when your dates are strings and your output also has to be a string.

The title already says everything. I didn’t find any code on StackOverflow to copy and paste for iterating over a range of dates in Scala, which I needed to easily delete (or iterate over) partitions in Spark.

A lot of old posts and SO answers suggested using joda-time; many also pointed out that since Java 8 we should use java.time instead.

I am pretty sure that other people have had the same problem, but I was not able to find an end-to-end solution, so here is the code:

import java.time.LocalDate
import java.time.format.DateTimeFormatter

val format = DateTimeFormatter.ofPattern("yyyy-MM-dd")
val start_dt = LocalDate.parse("2020-02-01", format)
val end_dt = LocalDate.parse("2020-02-20", format)
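// Number of days between the two dates
val date_diff = end_dt.toEpochDay() - start_dt.toEpochDay()

// `to` is inclusive, so end_dt is part of the range
for (i <- 0L to date_diff) {
  val s = start_dt.plusDays(i)
  println(s)  // a LocalDate prints as yyyy-MM-dd
}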
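
If you need this in more than one place, the same idea can be wrapped in a small function that takes strings and returns strings. This is only a sketch: the name dateRange and its pattern parameter are my own and not part of the snippet above, and it steps day by day with Iterator.iterate instead of computing the difference in epoch days.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical helper: takes two date strings and returns every date
// between them (both ends included), formatted with the same pattern.
def dateRange(start: String, end: String, pattern: String = "yyyy-MM-dd"): Seq[String] = {
  val fmt = DateTimeFormatter.ofPattern(pattern)
  val first = LocalDate.parse(start, fmt)
  val last = LocalDate.parse(end, fmt)
  Iterator
    .iterate(first)(_.plusDays(1)) // first, first + 1 day, first + 2 days, ...
    .takeWhile(!_.isAfter(last))   // keep going until we pass `last`
    .map(_.format(fmt))            // back to strings
    .toList
}

// 2020 is a leap year, so this contains 2020-02-29
val days = dateRange("2020-02-27", "2020-03-01")
```

In a notebook cell, days.foreach(println) then prints one date per line. If the start date is after the end date, the result is simply an empty list.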

Don’t forget that you can replace "to date_diff" with "until date_diff" to exclude the last date from your range.
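
To make the difference concrete, here is a small sketch that builds both ranges side by side. The dates and the names start, end, diff, inclusive and exclusive are picked just for this illustration.

```scala
import java.time.LocalDate

// LocalDate.parse without a formatter expects the ISO format yyyy-MM-dd
val start = LocalDate.parse("2020-02-01")
val end = LocalDate.parse("2020-02-04")
val diff = end.toEpochDay - start.toEpochDay // 3 days apart

// `to` keeps the end date, `until` drops it
val inclusive = (0L to diff).map(i => start.plusDays(i).toString)
val exclusive = (0L until diff).map(i => start.plusDays(i).toString)
```

Here inclusive holds four dates (2020-02-01 to 2020-02-04), while exclusive stops at 2020-02-03.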

In my case, the println(s) was replaced by:

dbutils.fs.rm(f"$s3_path%s/dt=$s%s", true)
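
Since a recursive delete is not something you can undo, it may be worth building the list of partition paths first and looking at it before removing anything. A rough sketch, assuming the partitions are laid out as dt=yyyy-MM-dd under a base path; basePath below is a made-up value and partitionPaths is a hypothetical helper.

```scala
import java.time.LocalDate

// Made-up base path, for illustration only
val basePath = "s3://my-bucket/my-table"

// Hypothetical helper: one "dt=yyyy-MM-dd" partition path per day, both ends included
def partitionPaths(base: String, start: LocalDate, end: LocalDate): Seq[String] =
  (0L to (end.toEpochDay - start.toEpochDay)).map(i => s"$base/dt=${start.plusDays(i)}")

val paths = partitionPaths(basePath, LocalDate.parse("2020-02-01"), LocalDate.parse("2020-02-03"))

// In a Databricks notebook, after checking `paths`, each one could be removed with:
// paths.foreach(p => dbutils.fs.rm(p, true))
```

The delete itself is left as a comment because dbutils only exists inside Databricks.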

I am writing this here just to google less in the future.