Mean and scaling, C++ Caffe vs Python wrapper #525
Comments
Yes. The quote from caffe.proto refers to |
Sure, but I mean the situation where you train a network from leveldb and keep the pretrained model file (since training takes a long time). There you use the unscaled mean, because the mean is subtracted before scaling. Later you want to use Python to make predictions. Naturally you would reuse the same mean, which is wrong in this situation: for Python you should use the scaled mean. |
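To make the mismatch concrete, here is a minimal sketch in plain Python. The function names and the numbers (pixel 120, mean 104, scale 1/256) are assumptions for illustration only and are not taken from Caffe; real code would operate on whole arrays rather than a single value.

```python
def subtract_then_scale(pixel, mean, scale):
    # Order described above for the C++ data layer: mean first, then scale.
    return (pixel - mean) * scale


def scale_then_subtract(pixel, mean, scale):
    # Opposite order: scale first, then subtract the mean.
    return pixel * scale - mean


pixel, mean, scale = 120.0, 104.0, 1.0 / 256  # illustrative values only

reference = subtract_then_scale(pixel, mean, scale)
same_mean = scale_then_subtract(pixel, mean, scale)
scaled_mean = scale_then_subtract(pixel, mean * scale, scale)

print(reference, same_mean, scaled_mean)
```

Reusing the unscaled mean in the scale-first order gives a different value, while multiplying the mean by the same scale restores agreement with the subtract-first result.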
Oh, good point. That does seem incorrect. We'll look into it (submit a PR if you like). |
Can't submit PR right now. If someone else could do it, that would be great. |
If I recall correctly this configuration is actually right for the reference ImageNet model, although this should certainly be better documented. On the C++ side the image is represented as uint8 in 0-255 while on the Python side scikit-image represents the image data as float in 0-1. For this reason the Python wrapper scales first so that it can re-map the data to 0-255 to match the precomputed mean. @longjon I see two options: either 1. make the Python preprocessing pipeline the same as the C++ data layer and change the stored mean, or 2. keep the current method with an explanatory comment. To me 2. is a little more flexible and less work, but might be slightly confusing. |
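A small sketch of that argument, assuming the same pixel arrives as uint8 in 0-255 on one side and as a float in 0-1 on the other. The helper names and the mean value 104.0 are hypothetical; this is plain Python, not pycaffe code.

```python
import math

MEAN = 104.0  # hypothetical precomputed mean, stored on the 0-255 scale


def center_uint8(pixel):
    # Pixel already in 0-255: subtract the mean directly.
    return pixel - MEAN


def center_unit_float(pixel, rescale=255.0):
    # Pixel in 0-1: map it back to 0-255 first so it matches the stored mean.
    return pixel * rescale - MEAN


raw = 51                 # uint8-style value
as_float = raw / 255.0   # the same pixel as a float in 0-1

assert math.isclose(center_uint8(raw), center_unit_float(as_float))
```

Without the rescaling step (`rescale=1.0`) the float pixel and the stored mean would be on different ranges, and the two paths would no longer agree.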
Actually, I am working on a CIFAR-10 Python classifier using the ImageNet example as my template, and it is very confusing :) |
@shelhamer, I'm confused about your statement that scikit-image represents image data as float in [0, 1]. As far as I can tell, it just invokes PIL, much like matplotlib:
$ ipython --no-banner
In [1]: from skimage import io
In [2]: io.imread('examples/images/cat.jpg').flat[:10]
Out[2]: array([26, 57, 49, 27, 58, 50, 25, 55, 47, 28], dtype=uint8)
Are you reading images differently? Looking at |
Another TODO for some brave soul: the mean subtraction assumes a 3 channel input and mean in the |
As far as I remember, with PIL an image whose pixels are of an integer type uses the [0, 255] scale, while an image whose pixels are of a float type has to be in [0, 1]. |
Interestingly, I am having a similar problem trying to predict a class label from a raw image in Python. I have 100 image files in a folder, and I tried the following two scenarios:
(1) Pack them into a leveldb and confirm that the testing error (from C++) is around 72%.
(2) Run them through Python, using the Classifier.predict function, and the accuracy drops to 42%!
Note that I have around 200 target classes.
x = caffe.io.load_image(filename, color=False)
scores = net.predict([x])
filters = net.blobs['prob'].data
f = filters.reshape(200)
return scores.reshape(200).argsort()[-1] |
@Denominator @sayanghosh check out #816. Let's sort out the input preprocessing in that PR. |
- load an image as [0,1] single / np.float32 according to Python convention
- fix input scaling during preprocessing:
  - scale input for preprocessing by `raw_scale` e.g. to map an image to [0, 255] for the CaffeNet and AlexNet ImageNet models
  - scale feature space by `input_scale` after mean subtraction
  - switch examples to raw scale for ImageNet models
  - fix BVLC#525
- preserve type after resizing.
- resize 1, 3, or K channel images with special casing between skimage.transform (1 and 3) and scipy.ndimage (K) for speed
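The order of operations named in this commit message can be sketched as follows. This is a hypothetical helper working on a plain list, not the pycaffe implementation; only the names `raw_scale` and `input_scale` and their order relative to mean subtraction come from the message, and the numbers are made up.

```python
def preprocess(values, raw_scale=1.0, mean=0.0, input_scale=1.0):
    """Scale raw input, subtract the mean, then scale the feature space."""
    out = []
    for v in values:
        v = v * raw_scale      # e.g. map [0, 1] to [0, 255]
        v = v - mean           # mean is expressed on the raw-scaled range
        v = v * input_scale    # applied after mean subtraction
        out.append(v)
    return out


image = [0.0, 0.5, 1.0]  # pixels loaded as floats in [0, 1]
result = preprocess(image, raw_scale=255.0, mean=100.0, input_scale=0.5)
print(result)
```

With the defaults the helper leaves the input unchanged, so each of the three steps can be switched on independently when comparing against a model's training-time preprocessing.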
Hi!
I noticed in caffe.proto (around line 209):
But then in the Python file pycaffe.py (around line 266):
and a few lines later:
Is that right?