Good way to input PASCAL-VOC 2012 training data and labels with tensorflow

Question

I want to do object detection on the PASCAL VOC 2012 dataset with TensorFlow.

I want to feed whole images, together with their object labels and the corresponding bounding boxes, into TensorFlow for training.

Is there a good way to write a data file for TensorFlow to read? Or should I just read the original XML files in TensorFlow?

Thank you very much.

Here is an image example:

Xyz · Answer 1 · 2018-06-29T10:45:08.990

There are pre-made tools for this; look in the TensorFlow models repository. In essence, their approach is:

Parse the XML annotation files and flatten the data structure within them.
Produce a TFRecord file that combines the annotations and the images.

This is arguably the best way.
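The first step, flattening an annotation, can be sketched with Python's standard library alone. This is a minimal sketch and not the code from the TensorFlow models repository: the function name `flatten_voc_annotation`, the output keys, and the sample annotation are made up for illustration. Only the XML element names (`filename`, `size`, `object`, `name`, `bndbox`, `xmin`, ...) follow the PASCAL VOC annotation layout.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the PASCAL VOC layout, for illustration only.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>100</width><height>80</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>60</xmax><ymax>70</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>5</xmin><ymin>8</ymin><xmax>40</xmax><ymax>75</ymax></bndbox>
  </object>
</annotation>"""


def flatten_voc_annotation(xml_text):
    """Turn one VOC annotation into a flat dict of scalars and parallel lists."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    flat = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "labels": [],
        "xmin": [], "ymin": [], "xmax": [], "ymax": [],
    }
    # Only direct <object> children; nested <part> elements are ignored here.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        flat["labels"].append(obj.findtext("name"))
        for key in ("xmin", "ymin", "xmax", "ymax"):
            # Parse via float first in case a coordinate is written as "10.0".
            flat[key].append(int(float(box.findtext(key))))
    return flat
```

Calling `flatten_voc_annotation(SAMPLE_XML)` gives one list per box coordinate, with index `i` in every list describing the same object. That shape maps directly onto key : value storage.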

For the sake of training, you can implement your own converter that takes an (XML, image) pair and saves it as a TFRecord example.

TFRecord is TensorFlow's format for storing data. Every TFRecord file is basically a list of examples. Every example is an object that holds data in key : value pairs, where the key is a string and the value is an array of primitive types (int, string, float).

So, first flatten your XML annotation to match the constraints of the TFRecord format, then use TensorFlow's TFRecordWriter to save the data to a file. Check the TensorFlow API; it will pay off.
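A minimal sketch of that second step follows, assuming the annotation has already been flattened into a dict of scalars and parallel lists. The helper names (`make_example`, `write_tfrecord`), the dict layout, and the feature key names are my own choices for illustration, not a fixed convention. `tf.io.TFRecordWriter` is the name in recent TensorFlow releases; older 1.x releases expose the same class as `tf.python_io.TFRecordWriter`.

```python
import tensorflow as tf


def _bytes_feature(values):
    # A feature value is always a list, even when it holds a single item.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))


def _int64_feature(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def make_example(image_bytes, flat):
    """Combine encoded image bytes and a flattened annotation into one Example.

    `flat` is a hypothetical dict with keys: width, height, labels,
    xmin, ymin, xmax, ymax (the last five are lists of equal length).
    """
    feature = {
        "image/encoded": _bytes_feature([image_bytes]),
        "image/width": _int64_feature([flat["width"]]),
        "image/height": _int64_feature([flat["height"]]),
        "image/object/class/text": _bytes_feature(
            [name.encode("utf8") for name in flat["labels"]]),
        "image/object/bbox/xmin": _int64_feature(flat["xmin"]),
        "image/object/bbox/ymin": _int64_feature(flat["ymin"]),
        "image/object/bbox/xmax": _int64_feature(flat["xmax"]),
        "image/object/bbox/ymax": _int64_feature(flat["ymax"]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(path, examples):
    """Serialize each Example and append it to a single TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for example in examples:
            writer.write(example.SerializeToString())
```

In practice `image_bytes` would be the raw contents of the JPEG file, read with `open(image_path, "rb").read()`, so the image is stored encoded and decoded again at training time.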

score 0 · Accepted Answer · edited May 23 '17 at 11:46

It seems that TF has no support for XML files yet.

You can try to make batches yourself and feed them to TF placeholders: https://www.tensorflow.org/versions/r0.10/how_tos/reading_data/index.html#feeding
You can write your own file format and your own decoder. Then you can read the file, decode its bytes with the tf.decode_raw function, and do whatever you want. Related question, if you want to read multiple files simultaneously: Tensorflow read images with labels

I think the first option is easier to implement.
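The first option could look roughly like the sketch below. It only covers building the batches in NumPy. The function name `iterate_batches`, the argument layout, and the choice to zero-pad every image's boxes up to `max_boxes` are assumptions made for illustration, since images in VOC contain different numbers of objects but a placeholder needs one fixed shape per batch.

```python
import numpy as np


def iterate_batches(images, boxes, batch_size, max_boxes):
    """Yield (image_batch, box_batch, box_counts) tuples.

    images: array of shape [N, H, W, 3], already resized to a common size.
    boxes:  list of N sequences, each holding that image's [xmin, ymin, xmax, ymax] rows.
    """
    n = len(images)
    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        image_batch = np.asarray(images[start:stop], dtype=np.float32)
        # Zero-padded box tensor; box_counts records how many rows are real.
        box_batch = np.zeros((stop - start, max_boxes, 4), dtype=np.float32)
        box_counts = np.zeros(stop - start, dtype=np.int32)
        for i, image_boxes in enumerate(boxes[start:stop]):
            rows = np.asarray(image_boxes, dtype=np.float32).reshape(-1, 4)
            rows = rows[:max_boxes]  # drop boxes beyond the fixed limit
            box_batch[i, :len(rows)] = rows
            box_counts[i] = len(rows)
        yield image_batch, box_batch, box_counts
```

With the TensorFlow 1.x feeding mechanism described at the link above, each yielded tuple would then be passed through `feed_dict` to placeholders of matching shapes.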

score -1 · Answer 3 · answered Jun 04 '18 at 21:48


First use labelImg-master to convert the boxed pictures into VOC annotated format, then use my utility from the link below to convert the VOC annotation files to .npz. The .npz format is a good, performance-efficient way to store both data and labels for image processing using Keras on TensorFlow.

Below is the code to convert any PASCAL VOC annotated files to .npz.

https://github.com/MATRIX4284/VOC_NPZ
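For readers unfamiliar with .npz, here is a minimal sketch of storing data and labels together in one archive. It is not the code from the linked repository: the function names, the array names (`images`, `labels`, `boxes`), and the simplification of one label and one box per image are assumptions made for illustration.

```python
import numpy as np


def save_voc_npz(path, images, labels, boxes):
    """Store images, class ids, and boxes as named arrays in one compressed file."""
    np.savez_compressed(path, images=images, labels=labels, boxes=boxes)


def load_voc_npz(path):
    """Read the three arrays back; indexing the archive loads each array fully."""
    with np.load(path) as data:
        return data["images"], data["labels"], data["boxes"]
```

The loaded arrays are ordinary NumPy arrays, so they can be passed straight to a Keras `fit` call or sliced into batches by hand.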


Kaustav Mukherjee


Welcome to Stack Overflow! Generally, links to a tool or library [should be accompanied by usage notes, a specific explanation of how the linked resource is applicable to the problem, or some sample code](http://meta.stackoverflow.com/a/251605), or if possible all of the above. – Morse Jun 04 '18 at 21:55
