Thread

3 tweets

1
BigQuery? So the funny thing is that Spark moved more and more towards SQL. I see few uses of the low-level RDD interface. But then a system like BigQuery, or even AWS Athena, or any of the other query engines with a SQL interface, seems even better b/c there's no need to spin up a cluster
2
@rorcde With Spark you also need to manage data layout (partitions) pretty well to get good runtimes. Something closer to a "real" DB may even have better performance. When Spark started, scalable DBs weren't that common; now there are many options.
3
@rorcde At the other end of the spectrum, people would rather spin up a notebook server with lots of RAM and use pandas so they don't need to change tools.