Tag: dask
Pandas Benchmarking
As a beginning data scientist coming from a systems background, I am always interested in seeing how fast things run and in comparing methods to learn the most efficient approach. I recently wrote a Python script that runs twenty tests. Each test generates a random data file (a random number of lines multiplied by 100,000) with a…