This blog post summarizes the interior scene dataset built when I was a research intern at VCLA@UCLA together with code for batch transformation of 3D models.
Computer Vision has nowadays evolved to a stage where most practical solutions to classical vision problems rely heavily on data. One typical example in 2D that advances both the field of deep learning and computer vision has been the ImageNet  dataset and the closely related Large Scale Visual Recognition Challenge, a.k.a. LSVRC , where researchers world-wide compete and compare performance of their algorithms to improve the upper limit of machines’ ability to understand visual information from the real world. We believe that the same principle applies to 3D and more broadly, virtual world from Virtual Reality  or Augmented Reality , especially ones that robot agents reside so that resources required to train robots could be reduced to a much lower level, making it easier and cheaper. These in mind, we built the SceneSet-50 dataset with the hope that relevant research community could benefit from it on problems regarding scene understanding, 3D modelling, virtural reality and so on.
The SceneSet-50 dataset maintains 50 complete scenes, convering 5 cultural backgrounds – American, Chinese, European, Japanese and Mediterranean – and 5 room types – bedroom, kitchen, laundry, living room and study, each of which has two instances, typically one traditional and one modern. Normally, every scene has more than 10 objects, some of which have even interactive components, such as doors, drawers, windows that open. We hope that the open-sourcing of the dataset will enable advanced research and progress in related computer vision directions, such as 3D reconstruction, virtual reality, scene understanding and especially robotics, where agents could be trained using the scenes without costly scene construction and catastrophic damage in reality.
To give a feel of how the scenes look like
Figure 1. A bedroom example.
Figure 2. A study example.
Figure 3. A rendered example.
Figure 4. Depth feature map.
We now welcome you to use our dataset for computer science research and even fields outside computer science, such as cognitive science.
The directory contains the original models built by SketchUp and exported models in OBJ format, together with a guide on how to maintain and modify the dataset.
3. Batch Automatic Transformation of 3D Models
The batch automatic transformation script converts models of any file format into OBJ, aligns the center of the model’s bounding box to the origin of the world coordinates, applies the transformation specified in a file and renames the texture files to make it easier for rendering engine.
The code resides here.
- Windows (currently working only under Windows)
- MeshLab v1.3.4 BETA
To run the script, first add the meshlabserver to your environment path and then navigate to the directory where your models are stored. Assuming the transformation is prepared, you can simply run the script or import it and call function main as follows
import utils utils.main("your_model_name", "template.mlx", "your_transformation")
An example is already provided in the repo: the *_ml*s are the script output.
The script will first convert your model into OBJ, read the transformation specified in the transformation file, fill the template to be fed to the meshlabserver, rename the textures, and update the references to the textures in the OBJ’s mtl file.
Note that the transformed file should be stored in the same directory and a model-based script will be generated. The model-based script will also triangulate all the meshes in your model.
If your model, after conversion to OBJ, does not have texture files, it’s either that the model just doesn’t have textures or that texture mapping is incorporated inside the .obj file by vertex coloring. Some software might not support this per-vertex color OBJ format.
So it’s strongly encouraged to use OBJ models in the beginning so that the pipeline works well.
 Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). Imagenet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on (pp. 248-255). Ieee.
 Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., … & Berg, A. C. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211-252.
 Wikipedia. Virtual realtiy..
 Wikipedia. Augmented realtiy..