I tried the original Python (x64) first, but I just couldn’t install scipy. As a matter of fact, scipy doesn’t provide a pre-compiled build for Windows in the first place, and with Python installed this way I think I have no hope of building it myself.
So next, I went for another option where I can properly set up my compiler, namely cygwin.
I have read somewhere that it will work if I install Python (32 bit) instead. But why should I? This is 2017!
I tried cygwin … it failed to build
I first installed these packages from cygwin’s installer.
python3-devel (you need this if you want to install packages like
After installation, Python 3 takes the name python3. To make it work as the default python, I did some ln -s and some export PATH so that it is picked up by default.
I tried to install numpy, which I did install successfully (an arduously long install due to the compilation time).
Then I tried scipy, and this time it failed with the following error:
numpy.distutils.system_info.NotFoundError: no lapack/blas resources found
I couldn’t find any solution or workaround online … so I paused there!
So, I give conda a try … It seems to work
You don’t have to install the full Anaconda to use conda; the smaller Miniconda will do the job.
conda has its own way of installing packages. It does not go through pip: it downloads pre-built binary packages from its own channels, although pip still works inside a conda environment. So in short, conda is a multi-platform, multi-language package manager.
When I installed scipy, it seemed to me that conda also installed mkl (I don’t know much about it; it is Intel’s Math Kernel Library, an optimized math library for Intel CPUs).
But in the end, it did install numpy successfully! Yay …
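A quick sanity check after installing can be sketched like this. It only uses the standard library to ask whether the current interpreter can find a package; the helper name `installed` is made up for this example.

```python
import importlib.util


def installed(names):
    """Map each top-level package name to True if the interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in installed(["numpy", "scipy"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it with the python that conda put on your PATH; if a different interpreter answers, the packages may show up as missing even though conda installed them.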
multiprocessing is slow under Windows
After happily using it for a while … I noticed that multiprocessing.Pool is quite slow; it’s slower than single-core processing! I’m not new to this. I know how to make it work properly, at least in a unix-like environment.
So, I dug deeper into this problem … which turned out to be a problem with Windows itself.
conda is not to blame for it; it’s a problem between Windows and Python
As you know, multiprocessing.Pool uses multiple processes to get a speed boost, which threading cannot provide for CPU-bound work in Python (because of the Global Interpreter Lock).
From the user’s point of view this is not much of a problem: multiprocessing is itself easy enough to use, as long as you keep your program functional.
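To illustrate what "functional" means here, below is a minimal sketch, not code from my actual project. The worker `sum_of_squares` is a made-up pure function defined at module level; it depends only on its argument, so it can be shipped to other processes safely.

```python
from multiprocessing import Pool


def sum_of_squares(n):
    """Pure function: the result depends only on n and nothing is mutated."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # The guard matters on Windows, where each worker re-imports this module.
    with Pool() as pool:
        results = pool.map(sum_of_squares, [10, 100, 1000])
    print(results)
```

Because the function touches no shared state, the same code gives the same results whether it runs in one process or many.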
However, it is said that spawning a process on Windows is much, much slower than on Linux, as asked here: http://stackoverflow.com/questions/8775475/python-using-multiprocess-is-slower-than-not-using-it. To me, the argument is quite valid. Windows has no fork(), so Python must start a fresh interpreter for every worker process, whereas Linux can cheaply fork the parent. I have long heard that Windows favors multi-threading and Linux favors multi-processing, so each OS is surely optimized for the one it favors.
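If you want to see the startup cost on your own machine, here is a rough sketch. It reports which start method Python uses by default and times how long it takes to start and join a do-nothing process. The function names and the repeat count are my own choices for illustration, and the numbers will vary a lot between machines.

```python
import multiprocessing as mp
import time


def noop():
    """Do nothing, so only process startup and teardown are measured."""


def start_method_info():
    """Return the default start method and all methods this platform offers."""
    return mp.get_start_method(), mp.get_all_start_methods()


def time_process_startup(repeats=5):
    """Average wall-clock seconds to start and join one empty process."""
    start = time.perf_counter()
    for _ in range(repeats):
        p = mp.Process(target=noop)
        p.start()
        p.join()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    method, available = start_method_info()
    print(f"default start method: {method} (available: {available})")
    print(f"average startup: {time_process_startup() * 1000:.1f} ms per process")
```

On Windows the only available method is spawn, which is the expensive one; running the same script on Linux makes for an easy comparison.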
Unfortunately, multi-threading does not give a speed boost for computationally intensive work in Python, and multi-processing is not very well supported by Windows. I am left with almost zero options; it seems quite a deal breaker.
chunksize might help
As a matter of fact, Windows processes are quite slow to start, and this greatly reduces the viability of Python’s multiprocessing.Pool. If spawning processes and passing work to them is expensive, shouldn’t we hand each process fewer, larger pieces of work instead? This is exactly what the chunksize parameter of imap is for:

    from multiprocessing import Pool

    if __name__ == "__main__":  # needed on Windows, where workers re-import this module
        with Pool() as pool:
            # fn and jobs stand for your own function and iterable of inputs
            for result in pool.imap(fn, jobs, chunksize=16):
                ...
So, in short, you should set chunksize somewhat larger than 1 and see if that mitigates the problem.
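To check whether it actually helps, you can time the same jobs with different chunk sizes. The sketch below is only an illustration: `work` is a deliberately tiny made-up job, the worker count of 4 is an assumption, and `suggest_chunksize` is a hypothetical helper that aims for roughly four chunks per worker, which is a rule of thumb and not a guarantee.

```python
import time
from multiprocessing import Pool


def work(x):
    """A deliberately tiny job, so dispatch overhead dominates."""
    return x * x


def suggest_chunksize(n_jobs, n_workers):
    """Hypothetical helper: aim for roughly four chunks per worker."""
    chunksize, extra = divmod(n_jobs, n_workers * 4)
    return chunksize + 1 if extra else max(chunksize, 1)


def timed_imap(pool, jobs, chunksize):
    """Run all jobs through pool.imap and return (seconds, results)."""
    start = time.perf_counter()
    results = list(pool.imap(work, jobs, chunksize=chunksize))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    jobs = range(100_000)
    n_workers = 4  # assumption; adjust to your machine
    with Pool(n_workers) as pool:
        for chunksize in (1, 16, suggest_chunksize(len(jobs), n_workers)):
            elapsed, _ = timed_imap(pool, jobs, chunksize)
            print(f"chunksize={chunksize}: {elapsed:.2f} s")
```

If the jobs are large and slow on their own, the difference will be small; chunking pays off mostly when each job is short compared to the cost of sending it to a worker.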
I will try to use it for the moment … hope I can live with it :D