
Thread

3 tweets

1
@justincormack @lproven @kmett I only went into this a bit while working on jblas. Numerical workloads like matrix-matrix multiplication are interesting because, on the one hand, you can exploit the parallelism from having multiple FPUs in a modern CPU ...
2
@hisham_hm @justincormack @lproven @kmett ... while also organizing the flow of data (GBs of data) so that you exploit locality and don't get slowed down by the memory hierarchy. And then there are vectorized operations, and nowadays GPUs of course, and so on.
3
@hisham_hm @justincormack @lproven @kmett It's a very interesting thought what it would take to map all of that into a language. But C is definitely a waaaaay too simplified view of modern CPUs to give you a handle on controlling this.
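
To make the locality point from the thread concrete, here is a minimal sketch in C of a plain triple-loop multiply next to a cache-blocked (tiled) one. The names `matmul_naive`, `matmul_blocked`, and `BLOCK` are hypothetical and chosen for illustration; this is not code from jblas, and the tile size of 32 is an assumed value, not a tuned one. Note that nothing in the C source says "keep this tile in cache": the loop order only makes reuse likely, which is roughly the limitation the last tweet describes.

```c
#include <stddef.h>

/* Tile edge length. 32 is an assumption for illustration; a good value
 * depends on the cache sizes of the target CPU. */
#define BLOCK 32

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Reference version: c = a * b for row-major n x n matrices.
 * The inner loop walks b column-wise, striding n doubles per step. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Blocked version: same arithmetic, but restricted to BLOCK x BLOCK
 * tiles so that data loaded for one tile is reused before moving on. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_size(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_size(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_size(jj + BLOCK, n);
                /* Multiply one tile of a by one tile of b and
                 * accumulate into the matching tile of c. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        /* Innermost loop is unit-stride in b and c. */
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both functions compute the same product; only the order in which memory is touched differs. Tuned BLAS implementations typically go further (vector instructions, packing tiles into contiguous buffers, multiple threads), none of which this sketch attempts.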