Learning Local Image Descriptors Data

These datasets are provided courtesy of Simon Winder, Matt Brown, Noah Snavely, Steven Seitz, and Richard Szeliski. You may use them freely for research purposes with acknowledgement.

This page contains links to the data used in the paper:

S. Winder and M. Brown. Learning Local Image Descriptors. To appear International Conference on Computer Vision and Pattern Recognition (CVPR2007) (pdf 300Kb)

The data is taken from Photo Tourism reconstructions from Trevi Fountain (Rome), Notre Dame (Paris) and Half Dome (Yosemite). Each dataset consists of a series of corresponding patches, which are obtained by projecting 3D points from Photo Tourism reconstructions back into the original images. See the paper for details.

Description of the data

The dataset consists of 1024 x 1024 bitmap (.bmp) images, each containing a 16 x 16 array of image patches. Here are some examples (click on the image to enlarge):

Each patch is sampled as 64 x 64 grayscale, with a canonical scale and orientation. For details of how the scale and orientation is established, please see the paper. An associated metadata file info.txt contains the match information. Each row of info.txt corresponds to a separate patch, with the patches ordered from left to right and top to bottom in each bitmap image. The first number on each row of info.txt is the 3D point ID from which that patch was sampled -- patches with the same 3D point ID are projected from the same 3D point (into different images). The second number in info.txt corresponds to the image from which the patch was sampled, and is not used at present.

Download the datasets

Follow the links below to download zipfiles for each of the 3 datasets. Each contains around 100,000 patches and is around 350Mb in size:

Trevi Notre Dame Half Dome

Contact

If you have any questions please contact Simon Winder or Matthew Brown.

Acknowledgements

We'd like to thank Noah Snavely for making his Photo Tourism reconstructions available to us.